The visual tries to read data from the data source - Negative impact on large data sources in Direct Query #21
Could someone, maybe @stopyoukid, take a look at this? I tried to fix this myself, but the only thing I managed to do was decrease the number of rows sent to the visual; with that change, my Power BI report no longer generates a warning in Power BI Service. But another change is needed as well. With the current implementation my filter visual takes ~20 seconds to update since I'm using Direct Query. The visual generates approximately this code: As you can imagine, doing a GroupBy and OrderBy on an NVARCHAR(4000) column with >1 million rows, and then a TOPN on top of that, takes a while. The TOPN only protects Power BI from getting too much data; it does not affect the amount of data that needs to be grouped and ordered. Since I don't see any reason for this kind of filter to read data, couldn't we just remove the feature that reads data? |
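To illustrate the point about TOPN, here is a rough Python sketch of a DISTINCT + ORDER BY + TOP pipeline. It is a simplified model, not what SQL Server or Power BI actually executes (a real query plan may optimize differently), and the function name and the generated sample rows are made up for the illustration. It only shows that in this naive model the limit is applied after all rows have been grouped and sorted.

```python
import random
import string


def distinct_order_topn(rows, limit):
    """Mimic DISTINCT + ORDER BY + TOP (limit) over a text column.

    Returns the limited result plus counters for how many values
    had to be de-duplicated and sorted before the limit applied.
    """
    distinct_values = set(rows)        # grouping touches every row
    ordered = sorted(distinct_values)  # sorting touches every distinct value
    work = {"rows_grouped": len(rows), "values_sorted": len(distinct_values)}
    return ordered[:limit], work


random.seed(0)
# Hypothetical stand-in for a long text column (real data would be far larger).
rows = ["".join(random.choices(string.ascii_lowercase, k=20)) for _ in range(100_000)]

small, work_small = distinct_order_topn(rows, 10)
large, work_large = distinct_order_topn(rows, 30_002)

print(len(small), work_small)
print(len(large), work_large)
```

Both calls report the same amount of grouping and sorting work; only the number of rows handed back differs, which matches the behaviour described above.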
I totally agree, TextFilter doesn't need data at all, just a reference to the column it filters. However, it seems this is a limitation of the custom visual API. I don't know of another way of telling the Power BI API that we need the field only and no data. In regards to your PR #26, we've gotten conflicting information (my team I mean). It does seem in some instances, maybe depending on how data is connected up to the workbook, that the In regards to the perf issue, I pushed some updates to that
When you use it, try setting your field in Power BI to be either "Last " or "First " like below, maybe this will help. |
Thanks a lot @stopyoukid. I used Count instead of First/Last since First/Last generated a Min/Max in T-SQL. Imagine doing Min/Max on millions of NVARCHAR(4000) rows :-) With the limitations you mention, I guess this is the best we can do. Something I noticed:
This is the T-SQL code generated when not using aggregates: So what I would suggest (if possible)
|
It also looks like some earlier problems I had, like those described in issue #19, are gone now. For me these problems didn't appear after some days; they came now and then, whenever I changed the filtered columns or duplicated the filter. |
I'm revisiting this and, seeing that nothing has happened, adding some comments. |
Was this ever released? We are trying to use this feature, but don't see the First / Last etc |
Opus, yes this one was long ago. As I remember, I downloaded something that was not on the main branch. If that doesn't help: the implicit measures like Last, Count etc. are not available when using calculation groups, so if you are using those you will not see them. Alternatively, write your own explicit measure. @stopyoukid Any news on this? As I remember, there were things that could be done without much effort. |
I tried this filter using Direct Query. I haven't used it for in-memory models, but for Direct Query I see no point in reading data from the data source; this just puts too much load on the data source.
My suggestion is to implement a settings property where developers can choose to not read data from the data source.
I get this error message when using this visual on data sources that have a large amount of data in the text column:
-->
Couldn't load the data for this visual
Couldn't retrieve the data for this visual. Please try again later.
Please try again later or contact support. If you contact support, please provide these details.
Activity ID: 5309b3b7-9142-473e-93ff-8b4dbb7abcd0
Request ID: c79958ce-841f-4b21-0792-502916286a85
Correlation ID: dd4f7fe6-468b-64e4-e280-dfa66d3c8e4f
Time: Thu Nov 12 2020 08:36:27 GMT+0100 (Central European Standard Time)
Service version: 13.0.14696.63
Client version: 2011.1.03669-train
Cluster URI: https://wabi-north-europe-redirect.analysis.windows.net/
<--
The error message is not due to a large number of rows, since using this visual on other columns in the same table doesn't generate this error. The error occurs because this column holds a large amount of data.
Yes, I know that Full Text Index is not supported and that using this filter on large text fields and tables will be slow. The customers/users are willing to accept it being slow, but not that an error message/warning is generated.
For some reason, I see this error more frequently in my P5 Premium tenant than in my Power BI Desktop.
Checking the generated DAX-code, there is a limit in the number of rows retrieved:
-->
DEFINE
VAR __DS0Core =
DISTINCT('Test Tickets'[Description])
VAR __DS0BodyLimited =
TOPN(30002, __DS0Core, 'Test Tickets'[Description], 1)
EVALUATE
__DS0BodyLimited
ORDER BY
'Test Tickets'[Description]
<--
But I guess that the combination of 30002 rows of NVARCHAR(4000) is in some situations just too much for Power BI to handle, see the SQL code below:
-->
SELECT
TOP (30002) [t10].[Description]
FROM
(
SELECT
CAST([Description] AS NVARCHAR(4000))
...
<--
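As a back-of-the-envelope check on that guess, the worst-case size of the result can be estimated from the numbers in the generated queries. The sketch below assumes every one of the 30002 rows is filled to the full 4000 characters and that NVARCHAR uses 2 bytes per character; real columns will usually be much smaller, and the function name is only for illustration.

```python
ROW_LIMIT = 30_002   # from the TOPN / TOP clause in the generated queries
MAX_CHARS = 4_000    # NVARCHAR(4000)
BYTES_PER_CHAR = 2   # NVARCHAR stores 2 bytes per character (assumed worst case)


def worst_case_bytes(row_limit=ROW_LIMIT, max_chars=MAX_CHARS,
                     bytes_per_char=BYTES_PER_CHAR):
    """Upper bound on raw text bytes if every row is filled to the maximum."""
    return row_limit * max_chars * bytes_per_char


total = worst_case_bytes()
print(f"{total:,} bytes (~{total / 1024**2:.0f} MiB)")
```

That upper bound is on the order of 240 MB of raw text for a single visual, before any protocol or serialization overhead, which would be consistent with the error only showing up on the large text column.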