Merge pull request #1 from PandaTechAM/development
npgsqlcopy add
HaikAsatryan authored Apr 10, 2024
2 parents 3fdc708 + 7407581 commit 54572c6
Showing 15 changed files with 684 additions and 57 deletions.
139 changes: 105 additions & 34 deletions Readme.md
@@ -1,21 +1,46 @@
# Pandatech.EFCore.PostgresExtensions
Pandatech.EFCore.PostgresExtensions is a NuGet package that enhances Entity Framework Core with support for PostgreSQL-specific syntax for update operations.
- [1. Pandatech.EFCore.PostgresExtensions](#1-pandatechefcorepostgresextensions)
- [1.1. Features](#11-features)
- [1.2. Installation](#12-installation)
- [1.3. Usage](#13-usage)
- [1.3.1. Row-Level Locking](#131-row-level-locking)
- [1.3.2. Npgsql COPY Integration](#132-npgsql-copy-integration)
- [1.3.2.1. Benchmarks](#1321-benchmarks)
- [1.3.2.1.1. General Benchmark Results](#13211-general-benchmark-results)
- [1.3.2.1.2. Detailed Benchmark Results](#13212-detailed-benchmark-results)
- [1.3.2.1.3. Efficiency Comparison](#13213-efficiency-comparison)
- [1.3.2.1.4. Additional Notes](#13214-additional-notes)
- [1.4. License](#14-license)

## Introduction
You can install the Pandatech.EFCore.PostgresExtensions NuGet package via the NuGet Package Manager UI or the Package Manager Console using the following command:
# 1. Pandatech.EFCore.PostgresExtensions

Pandatech.EFCore.PostgresExtensions is a NuGet package that adds PostgreSQL-specific features to Entity Framework Core
which the official Npgsql.EntityFrameworkCore.PostgreSQL package does not cover. It provides row-level locking and a
typed wrapper around the PostgreSQL COPY operation, both exposed through EF Core-style extension methods.

## 1.1. Features

1. **Row-Level Locking**: Implements the PostgreSQL `FOR UPDATE` clause with three lock behaviors, `Default` (wait),
   `SkipLocked`, and `NoWait`, for transaction control and concurrency management.
2. **Npgsql COPY Integration**: Offers a typed interface to the PostgreSQL COPY command for bulk inserts from EF Core.
   In the benchmarks in this document, it inserts roughly 12x to 33x faster than EF Core.

## 1.2. Installation

To install Pandatech.EFCore.PostgresExtensions, run the following command in the NuGet Package Manager Console:

```powershell
Install-Package Pandatech.EFCore.PostgresExtensions
```

## 1.3. Usage

## Features
Adds support for PostgreSQL-specific update syntax.
Simplifies handling of update operations when working with PostgreSQL databases.
### 1.3.1. Row-Level Locking

## Installation
1. Install Pandatech.EFCore.PostgresExtensions Package
```Install-Package Pandatech.EFCore.PostgresExtensions```

2. Enable Query Locks
Configure your DbContext to use Npgsql and enable query locks:

Inside the AddDbContext or AddDbContextPool method, after calling UseNpgsql(), call the UseQueryLocks() method on the DbContextOptionsBuilder to enable query locks.
```csharp
services.AddDbContext<MyDbContext>(options =>
{
@@ -24,36 +49,82 @@ services.AddDbContext<MyDbContext>(options =>
});
```

## Usage
Use the provided ForUpdate extension method on IQueryable within your application to apply PostgreSQL-specific update syntax.
Within a transaction scope, apply the desired lock behavior using the `ForUpdate` extension method:

```csharp
using Pandatech.EFCore.PostgresExtensions;
using Microsoft.EntityFrameworkCore;
using var transaction = _dbContext.Database.BeginTransaction();
try
{
var entityToUpdate = _dbContext.Entities
.Where(e => e.Id == id)
.ForUpdate(LockBehavior.NoWait) // Or use LockBehavior.Default (Wait)/ LockBehavior.SkipLocked
.FirstOrDefault();

// Inside your service or repository method
using (var transaction = _dbContext.Database.BeginTransaction())
// Perform updates on entityToUpdate
await _dbContext.SaveChangesAsync();
transaction.Commit();
}
catch (Exception ex)
{
try
{
// Use the ForUpdate extension method on IQueryable inside the transaction scope
var entityToUpdate = _dbContext.Entities
.Where(e => e.Id == id)
.ForUpdate()
.FirstOrDefault();
transaction.Rollback();
// Handle exception
}
```
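
To make the three lock behaviors concrete, the sketch below maps each one to the PostgreSQL locking clause it corresponds to. The enum and class names are hypothetical stand-ins, not the package's own types; only the member names (`Default`, `SkipLocked`, `NoWait`) are taken from the usage comment above, and the SQL clauses are standard PostgreSQL syntax.

```csharp
using System;

// Hypothetical stand-in for the package's LockBehavior enum (member names follow the README).
public enum LockBehaviorSketch { Default, SkipLocked, NoWait }

public static class LockClauseSketch
{
    // Returns the PostgreSQL row-locking clause that matches each behavior.
    public static string ToSqlClause(LockBehaviorSketch behavior) => behavior switch
    {
        // Block until any conflicting row lock is released.
        LockBehaviorSketch.Default => "FOR UPDATE",
        // Leave out rows that another transaction has already locked.
        LockBehaviorSketch.SkipLocked => "FOR UPDATE SKIP LOCKED",
        // Fail immediately with an error if a row is already locked.
        LockBehaviorSketch.NoWait => "FOR UPDATE NOWAIT",
        _ => throw new ArgumentOutOfRangeException(nameof(behavior))
    };
}
```

`SkipLocked` tends to suit queue-style workers that should pick up the next free row, while `NoWait` suits callers that prefer a fast failure over blocking.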

// Perform updates on entityToUpdate
### 1.3.2. Npgsql COPY Integration

await _dbContext.SaveChangesAsync();
For bulk data operations, use the `BulkInsert` or `BulkInsertAsync` extension methods:

transaction.Commit();
}
catch (Exception ex)
```csharp
public async Task BulkInsertExampleAsync()
{
var users = new List<UserEntity>();
for (int i = 0; i < 10000; i++)
{
transaction.Rollback();
// Handle exception
users.Add(new UserEntity { /* Initialization */ });
}

await dbContext.Users.BulkInsertAsync(users); // Or use BulkInsert for synchronous operation
// Rows are written directly to the database; no separate SaveChanges call is needed
}
```
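
For context, the `BulkInsert` implementation in this commit builds a `COPY ... FROM STDIN (FORMAT BINARY)` statement from the entity's table and column names. A minimal sketch of that string assembly, using only the base class library, might look like the following. `CopySqlSketch`, `Quote`, and `BuildCopySql` are hypothetical names, and doubling embedded quotes is an added precaution that the commit itself does not apply.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CopySqlSketch
{
    // Wraps an identifier in double quotes and doubles any embedded quote,
    // which is the standard SQL way to escape a quoted identifier.
    public static string Quote(string identifier) =>
        "\"" + identifier.Replace("\"", "\"\"") + "\"";

    // Builds the binary COPY statement for one table and its target columns.
    public static string BuildCopySql(string tableName, IEnumerable<string> columnNames)
    {
        var columns = string.Join(", ", columnNames.Select(c => Quote(c)));
        return $"COPY {Quote(tableName)} ({columns}) FROM STDIN (FORMAT BINARY)";
    }
}
```

Quoting every identifier preserves the mixed-case table and column names that EF Core generates by default, which PostgreSQL would otherwise fold to lower case.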
## License

#### 1.3.2.1. Benchmarks

The benchmarks below compare `BulkInsert` (Npgsql COPY) with batched inserts through EF Core and Dapper:

##### 1.3.2.1.1. General Benchmark Results

| Method     | Time complexity | Throughput (1M rows) | Batch size  |
|------------|-----------------|----------------------|-------------|
| BulkInsert | O(n)            | 350,000 rows/s       | No batching |
| Dapper     | O(n)            | 20,000 rows/s        | 1500        |
| EFCore     | O(n)            | 10,600 rows/s        | 1500        |

##### 1.3.2.1.2. Detailed Benchmark Results

| Operation | BulkInsert | Dapper | EF Core |
|-------------|------------|--------|---------|
| Insert 10K | 76ms | 535ms | 884ms |
| Insert 100K | 405ms | 5.47s | 8.58s |
| Insert 1M | 2.87s | 55.85s | 94.57s |

##### 1.3.2.1.3. Efficiency Comparison

| RowsCount | BulkInsert Efficiency | Dapper Efficiency |
|-----------|----------------------------|---------------------------|
| 10K | 11.63x faster than EF Core | 1.65x faster than EF Core |
| 100K | 21.17x faster than EF Core | 1.57x faster than EF Core |
| 1M | 32.95x faster than EF Core | 1.69x faster than EF Core |
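
The efficiency factors above follow from the detailed timings: each one is the EF Core time divided by the time of the method being compared. The sketch below shows that arithmetic along with a throughput calculation; `SpeedupSketch` and its method names are hypothetical helpers written for illustration.

```csharp
public static class SpeedupSketch
{
    // Speedup of a candidate over a baseline: baseline time divided by candidate time.
    public static double Speedup(double baselineMs, double candidateMs) =>
        baselineMs / candidateMs;

    // Throughput in rows per second for a run that inserted `rows` rows in `elapsedMs` milliseconds.
    public static double RowsPerSecond(long rows, double elapsedMs) =>
        rows / (elapsedMs / 1000.0);
}
```

For example, `SpeedupSketch.Speedup(884, 76)` is about 11.63, matching the 10K row, and `SpeedupSketch.RowsPerSecond(1_000_000, 2870)` is about 348,000 rows per second, in line with the rounded throughput figure for `BulkInsert`.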

##### 1.3.2.1.4. Additional Notes

- The `BulkInsert` feature currently does not support entity properties intended for `JSON` storage.

- The performance metrics provided above are based on benchmarks conducted under controlled conditions. Real-world
performance may vary based on specific use cases and configurations.

## 1.4. License

Pandatech.EFCore.PostgresExtensions is licensed under the MIT License.
Binary file added img.png
14 changes: 7 additions & 7 deletions src/EFCore.PostgresExtensions/EFCore.PostgresExtensions.csproj
@@ -8,22 +8,22 @@
<PackageReadmeFile>Readme.md</PackageReadmeFile>
<Authors>Pandatech</Authors>
<Copyright>MIT</Copyright>
<Version>1.0.0</Version>
<Version>2.0.0</Version>
<PackageId>Pandatech.EFCore.PostgresExtensions</PackageId>
<Title>Pandatech.EFCore.PostgresExtensions</Title>
<PackageTags>Pandatech, library, EntityFrameworkCore, PostgreSQL, For Update, Lock, LockingSyntax</PackageTags>
<Description>The Pandatech.EFCore.PostgresExtensions library enriches Entity Framework Core applications with advanced PostgreSQL functionalities, starting with the ForUpdate locking syntax. Designed for seamless integration, this NuGet package aims to enhance the efficiency and capabilities of EF Core models when working with PostgreSQL, with the potential for further PostgreSQL-specific extensions.</Description>
<PackageTags>Pandatech, library, EntityFrameworkCore, PostgreSQL, For Update, Lock, LockingSyntax, Bulk insert, BinaryCopy</PackageTags>
<Description>The Pandatech.EFCore.PostgresExtensions library enriches Entity Framework Core applications with advanced PostgreSQL functionalities, starting with the ForUpdate locking syntax and BulkInsert function. Designed for seamless integration, this NuGet package aims to enhance the efficiency and capabilities of EF Core models when working with PostgreSQL, with the potential for further PostgreSQL-specific extensions.</Description>
<RepositoryUrl>https://github.com/PandaTechAM/be-lib-efcore-postgres-extensions</RepositoryUrl>
<PackageReleaseNotes>InitialCommit</PackageReleaseNotes>
<PackageReleaseNotes>Npgsql copy feature</PackageReleaseNotes>
</PropertyGroup>

<ItemGroup>
<None Include="..\..\pandatech.png" Pack="true" PackagePath="\" />
<None Include="..\..\Readme.md" Pack="true" PackagePath="\" />
<None Include="..\..\pandatech.png" Pack="true" PackagePath="\"/>
<None Include="..\..\Readme.md" Pack="true" PackagePath="\"/>
</ItemGroup>

<ItemGroup>
<PackageReference Include="Microsoft.EntityFrameworkCore.Relational" Version="8.0.3" />
<PackageReference Include="Npgsql.EntityFrameworkCore.PostgreSQL" Version="8.0.2"/>
</ItemGroup>

</Project>
@@ -0,0 +1,159 @@
using System.Collections;
using System.Diagnostics;
using System.Reflection;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Metadata;
using Microsoft.Extensions.Logging;
using Npgsql;
using Npgsql.EntityFrameworkCore.PostgreSQL.Storage.Internal.Mapping;

namespace EFCore.PostgresExtensions.Extensions.BulkInsertExtension;

public static class BulkInsertExtension
{
public static ILogger? Logger { get; set; }

public static async Task BulkInsertAsync<T>(this DbSet<T> dbSet, List<T> entities,
bool pkGeneratedByDb = true) where T : class
{
var context = PrepareBulkInsertOperation(dbSet, entities, pkGeneratedByDb, out var sp, out var properties,
out var columnCount, out var sql, out var propertyInfos, out var propertyTypes);

var connection = new NpgsqlConnection(context.Database.GetConnectionString());
await connection.OpenAsync();

await using var writer = await connection.BeginBinaryImportAsync(sql);

for (var entity = 0; entity < entities.Count; entity++)
{
var item = entities[entity];
var values = propertyInfos.Select(property => property!.GetValue(item)).ToList();

ConvertEnumValue<T>(columnCount, propertyTypes, properties, values);

await writer.StartRowAsync();

for (var i = 0; i < columnCount; i++)
{
await writer.WriteAsync(values[i]);
}
}

await writer.CompleteAsync();
await connection.CloseAsync();
sp.Stop();

Logger?.LogInformation("Binary copy completed successfully. Total time: {Milliseconds} ms",
sp.ElapsedMilliseconds);
}

public static void BulkInsert<T>(this DbSet<T> dbSet, List<T> entities,
bool pkGeneratedByDb = true) where T : class
{
var context = PrepareBulkInsertOperation(dbSet, entities, pkGeneratedByDb, out var sp, out var properties,
out var columnCount, out var sql, out var propertyInfos, out var propertyTypes);

var connection = new NpgsqlConnection(context.Database.GetConnectionString());
connection.Open();

using var writer = connection.BeginBinaryImport(sql);

for (var entity = 0; entity < entities.Count; entity++)
{
var item = entities[entity];
var values = propertyInfos.Select(property => property!.GetValue(item)).ToList();

ConvertEnumValue<T>(columnCount, propertyTypes, properties, values);

writer.StartRow();

for (var i = 0; i < columnCount; i++)
{
writer.Write(values[i]);
}
}

writer.Complete();
connection.Close();
sp.Stop();

Logger?.LogInformation("Binary copy completed successfully. Total time: {Milliseconds} ms",
sp.ElapsedMilliseconds);
}

private static void ConvertEnumValue<T>(int columnCount, IReadOnlyList<Type> propertyTypes,
IReadOnlyList<IProperty> properties, IList<object?> values) where T : class
{
for (var i = 0; i < columnCount; i++)
{
if (propertyTypes[i].IsEnum)
{
values[i] = Convert.ChangeType(values[i], Enum.GetUnderlyingType(propertyTypes[i]));
continue;
}

// Check for generic types, specifically lists, and ensure the generic type is an enum
if (!propertyTypes[i].IsGenericType || propertyTypes[i].GetGenericTypeDefinition() != typeof(List<>) ||
!propertyTypes[i].GetGenericArguments()[0].IsEnum) continue;

var enumMapping = properties[i].FindTypeMapping();

// Only proceed if the mapping is for an array type, as expected for lists
if (enumMapping is not NpgsqlArrayTypeMapping) continue;

var list = (IList)values[i]!;
var underlyingType = Enum.GetUnderlyingType(propertyTypes[i].GetGenericArguments()[0]);

var convertedList = (from object item in list select Convert.ChangeType(item, underlyingType)).ToList();
values[i] = convertedList;
}
}


private static DbContext PrepareBulkInsertOperation<T>(DbSet<T> dbSet, List<T> entities, bool pkGeneratedByDb,
out Stopwatch sp, out List<IProperty> properties, out int columnCount, out string sql,
out List<PropertyInfo?> propertyInfos, out List<Type> propertyTypes) where T : class
{
sp = Stopwatch.StartNew();
var context = dbSet.GetDbContext();


if (entities == null || entities.Count == 0)
throw new ArgumentException("The model list cannot be null or empty.");

if (context == null) throw new ArgumentNullException(nameof(context), "The DbContext instance cannot be null.");


var entityType = context.Model.FindEntityType(typeof(T))! ??
throw new InvalidOperationException("Entity type not found.");

var tableName = entityType.GetTableName() ??
throw new InvalidOperationException("Table name is null or empty.");

properties = entityType.GetProperties().ToList();

if (pkGeneratedByDb)
properties = properties.Where(x => !x.IsKey()).ToList();

var columnNames = properties.Select(x => $"\"{x.GetColumnName()}\"").ToList();

if (columnNames.Count == 0)
throw new InvalidOperationException("Column names are null or empty.");


columnCount = columnNames.Count;
var rowCount = entities.Count;

Logger?.LogDebug(
"Column names found successfully. \n Total column count: {ColumnCount} \n Total row count: {RowCount}",
columnCount, rowCount);

sql = $"COPY \"{tableName}\" ({string.Join(", ", columnNames)}) FROM STDIN (FORMAT BINARY)";

Logger?.LogInformation("SQL query created successfully. Sql query: {Sql}", sql);

propertyInfos = properties.Select(x => x.PropertyInfo).ToList();
propertyTypes = propertyInfos.Select(x => x!.PropertyType).ToList();
return context;
}
}
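
The `ConvertEnumValue` step above replaces enum values with their underlying numeric values before they are written to the binary importer. A standalone sketch of that conversion, using only the base class library, is shown below. `SampleStatus`, `EnumConversionSketch`, and its method names are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enum used only to demonstrate the conversion.
public enum SampleStatus { Inactive = 0, Active = 1 }

public static class EnumConversionSketch
{
    // Converts a boxed enum value to a boxed value of its underlying type (int by default).
    public static object ToUnderlying(object enumValue)
    {
        var underlyingType = Enum.GetUnderlyingType(enumValue.GetType());
        return Convert.ChangeType(enumValue, underlyingType);
    }

    // Converts every element of an enum collection to the enum's underlying type.
    public static List<object> ToUnderlyingList(IEnumerable enumValues, Type enumType)
    {
        var underlyingType = Enum.GetUnderlyingType(enumType);
        var converted = new List<object>();
        foreach (var item in enumValues)
        {
            converted.Add(Convert.ChangeType(item, underlyingType));
        }
        return converted;
    }
}
```

The result of `ToUnderlying(SampleStatus.Active)` is a boxed `int` with value 1, which can be written to an integer column.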
15 changes: 15 additions & 0 deletions src/EFCore.PostgresExtensions/Extensions/DbSetExtensions.cs
@@ -0,0 +1,15 @@
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Infrastructure;

namespace EFCore.PostgresExtensions.Extensions;

public static class DbSetExtensions
{
public static DbContext GetDbContext<T>(this DbSet<T> dbSet) where T : class
{
var infrastructure = dbSet as IInfrastructure<IServiceProvider>;
var serviceProvider = infrastructure.Instance;
var currentDbContext = serviceProvider.GetService(typeof(ICurrentDbContext)) as ICurrentDbContext;
return currentDbContext.Context;

// CI annotation (GitHub Actions / deploy), line 13: Dereference of a possibly null reference.
}
}