Buffer error when reaching 2GB (2.147.869.560) byteLength #116

Open
jimver04 opened this issue Jan 12, 2023 · 18 comments

@jimver04

Hi,

The data I need to export is very large, and GLTFSDK raises an exception.

totalbyteLength does not seem to go beyond 2.147.869.560 bytes.
Do you know the maximum offset that ResourceWriter can write?
How can I extend it?

L86 @ ResourceWriter.cpp

image

L34 @ StreamUtils.cpp

image

I noticed that there is always exactly 1 Buffer and 1 BufferView when exporting to GLB.
Wouldn't it be more reasonable to have multiple BufferViews and Buffers, so that the exception is not reached?

Best,
Dimitrios

@bghgary
Contributor

bghgary commented Jan 13, 2023

See here: KhronosGroup/glTF#2114

@jimver04
Author

jimver04 commented Mar 3, 2023

I don't think it is related to #2114.

It is related to the 2GB limit of std::ostream, because we flush the data only at the end of the whole process:

image

I think we should flush more frequently to the file in order to avoid the 2GB limit of std::ostream.

It also has to do with the fact that we have only 1 Buffer and 1 BufferView.

@bghgary
Contributor

bghgary commented Mar 3, 2023

> I don't think it is related to #2114.

The code issue is not related, but the GLB file format itself cannot be bigger than 2GB because the header uses a uint32 for the length. Even if we fix the code issue, it still won't work.

EDIT: Or are you writing to a glTF and not a GLB?

@jimver04
Author

jimver04 commented Dec 4, 2023

Hi,

I am trying to export a 2.8GB CAE model to a single GLB file. So far, I have only managed to export models up to the int32 maximum, namely 2GB.

The total file length limit and the chunk 1 (binary part) length limit are both uint32. See below:
image

The uint32 maximum is 4GB: 0xFFFFFFFF (or 4294967295 in decimal)
image

So I assume that writing a 3.8 GB binary buffer together with 0.2 GB of JSON is possible as far as the GLB standard is concerned.
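
As a rough check of the arithmetic above, here is a minimal sketch of the GLB length fields as laid out in the glTF 2.0 specification (12-byte header, 8-byte chunk headers, all lengths uint32). The struct names and the `FitsInGlb` helper are hypothetical and are not part of GLTF-SDK.

```cpp
#include <cstdint>
#include <limits>

// GLB layout per the glTF 2.0 specification: a 12-byte file header followed
// by chunks, each with an 8-byte chunk header. All length fields are uint32.
struct GlbHeader {
    std::uint32_t magic;    // 0x46546C67, ASCII "glTF"
    std::uint32_t version;  // 2
    std::uint32_t length;   // total file length in bytes, header included
};

struct GlbChunkHeader {
    std::uint32_t chunkLength;  // length of the chunk data in bytes
    std::uint32_t chunkType;    // 0x4E4F534A ("JSON") or 0x004E4942 ("BIN")
};

static_assert(sizeof(GlbHeader) == 12, "GLB header is 12 bytes");
static_assert(sizeof(GlbChunkHeader) == 8, "GLB chunk header is 8 bytes");

// Hypothetical helper: true if a GLB with one JSON chunk and one BIN chunk of
// the given (already 4-byte aligned) sizes fits in the uint32 length field.
constexpr bool FitsInGlb(std::uint64_t jsonBytes, std::uint64_t binBytes) {
    return sizeof(GlbHeader) + 2 * sizeof(GlbChunkHeader) + jsonBytes + binBytes
           <= std::numeric_limits<std::uint32_t>::max();
}
```

With decimal gigabytes, `FitsInGlb(200000000, 3800000000)` is true (4,000,000,028 bytes in total). If GB were meant as GiB, the total would be 2^32 plus headers, which no longer fits.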

The crash in GLTF-SDK happens during addAccessor - more specifically in StreamUtils::WriteBinary(std::ostream& stream, const T* data, size_t size). It has the following callstack:

[4 images: call stack screenshots]

I have written a function that calculates the number of bytes (N) needed for my model a priori, without any bufferBuilder, by iterating over all parts of my model and summing the bytes for indices, vertices, morphing, and vertex colors for the last simulation frame.

Somehow, I need to find a way to pre-allocate N bytes in the ostream, so that StreamUtils::WriteBinary(std::ostream& stream, const T* data, size_t size) does not crash when the size of the stream exceeds 2GB.

Best,
Dimitrios

@jimver04
Author

jimver04 commented Dec 4, 2023

On my PC, streamsize does not seem to be subject to the 2GB limit:
image

@bghgary
Contributor

bghgary commented Dec 5, 2023

Ahh, sorry, I'm dumb and I'm not thinking about unsigned. Well, that seems like a bug then. Do you have an easy repro I can use to test the code?

@jimver04
Author

jimver04 commented Dec 5, 2023

Unfortunately, I cannot share the CAE models, and the problem is difficult to reproduce without a very big model. I am attaching screenshots with more information below. They show that if the 2GB limit is reached while adding accessors through bufferBuilder, the size of the ostream (in this case an in-memory stream) becomes -1, and failbit and badbit become true.
For details on failbit and badbit, see: https://cplusplus.com/reference/ios/ios/rdstate/
image
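
The failure described above can be caught at the write site rather than later. Below is a minimal sketch, assuming a helper named `WriteChecked` (hypothetical, not part of GLTF-SDK) that checks the stream state after each write. The demo simulates the failed state with `setstate` instead of actually writing 2GB.

```cpp
#include <cstddef>
#include <ios>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes `size` bytes and throws if the stream reports
// an error afterwards.
inline void WriteChecked(std::ostream& stream, const char* data, std::size_t size) {
    stream.write(data, static_cast<std::streamsize>(size));
    if (stream.fail()) {  // true when failbit or badbit is set
        throw std::runtime_error("stream write of " + std::to_string(size) +
                                 " bytes failed");
    }
}

// Small demonstration on an in-memory stream.
inline bool DemoDetectsFailure() {
    std::ostringstream out;
    WriteChecked(out, "abc", 3);      // succeeds on a healthy stream
    out.setstate(std::ios::failbit);  // simulate the state seen past the limit
    try {
        WriteChecked(out, "abc", 3);  // the write is rejected, failbit stays set
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Once the stream has failed, tellp() returns -1, which matches the size of -1 reported in the screenshot.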

@jimver04
Author

jimver04 commented Dec 5, 2023

Hi, I have found the following, but have not tested it yet, as my project is compiled with make and I do not know how to pass this option.

To use objects larger than 2GB in C++, you should set /LARGEADDRESSAWARE in the linker properties, on the
Configuration Properties > Linker > System property page. See these resources for more details:
[1] https://stackoverflow.com/questions/37413998/why-my-program-does-not-take-more-than-2-gb-ram-on-64-gb-ram-system
[2] https://learn.microsoft.com/en-us/cpp/build/reference/largeaddressaware-handle-large-addresses?view=msvc-170
[3] https://stackoverflow.com/questions/3109543/what-to-do-to-make-application-large-address-aware

image

@jimver04
Author

jimver04 commented Dec 6, 2023

Hi,
I have created a simple example that reproduces the issue. When I run it in x86 mode the limit is 512 MB, whereas in x64 mode the limit is 2 GB. The LargeAddressAware option has no effect in either mode; the result is the same.

image

I am using stringstream because the GLB writer uses stringstream, as found here:
image

Here is my example

#include <cstddef>
#include <iostream>
#include <sstream>
#include <vector>

int main() {

    // One chunk of data: 128 MB (2^27 bytes), zero-initialized.
    const std::size_t chunkSize = std::size_t{1} << 27;
    const std::vector<char> data(chunkSize);

    // The number of times to append this chunk to the stream.
    // So, 128 MB * 32 = 2^(27+5) = 2^32 bytes = 4 GB.
    const int ntimes = 32;

    std::stringstream stream;

    for (int i = 0; i < ntimes; i++)
    {
        // Write the chunk to the stream.
        stream.write(data.data(), static_cast<std::streamsize>(chunkSize));

        // Print the stream size; tellp() returns -1 once the stream has failed.
        const std::streamoff pos = stream.tellp();
        std::cout << pos << " bytes | " << pos / (1024 * 1024) << " MB"
                  << (stream.fail() ? " (stream failed)" : "") << "\n";
    }

    return 0;
}

@jimver04
Author

jimver04 commented Dec 6, 2023

It seems that it is OS / Compiler related.

@jimver04
Author

jimver04 commented Dec 6, 2023

On the other hand, there is no strict limit if we use std::string instead of std::stringstream. The example below, run on Windows, demonstrates that std::string can easily reach 16GB. Moreover, std::string seems faster, and we can also preallocate the memory needed by calculating a priori the number of bytes per index and vertex (std::stringstream does not allow preallocation). However, many things would have to change in the GLTF-SDK code ...

image
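
The preallocation idea can be sketched as follows. This is a minimal illustration only: `PartSizes`, `TotalBytes` and `BuildBuffer` are hypothetical names, the payload is zero bytes standing in for real mesh data, and nothing here reflects how GLTF-SDK lays out its buffers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one model part; the field names are illustrative.
struct PartSizes {
    std::size_t indexBytes;
    std::size_t vertexBytes;
    std::size_t colorBytes;
};

// Sum the byte counts of all parts before any data is written.
inline std::size_t TotalBytes(const std::vector<PartSizes>& parts) {
    std::size_t total = 0;
    for (const PartSizes& p : parts) {
        total += p.indexBytes + p.vertexBytes + p.colorBytes;
    }
    return total;
}

// Reserve the whole buffer once, then append each part's bytes.
inline std::string BuildBuffer(const std::vector<PartSizes>& parts) {
    std::string buffer;
    buffer.reserve(TotalBytes(parts));  // single allocation up front
    for (const PartSizes& p : parts) {
        // Placeholder payload: zero bytes stand in for the real mesh data.
        buffer.append(p.indexBytes + p.vertexBytes + p.colorBytes, '\0');
    }
    return buffer;
}
```

Because the capacity is reserved up front, the appends should not trigger reallocation, which is the likely reason for the speed difference mentioned above.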

bghgary added the bug label Dec 7, 2023
bghgary self-assigned this Dec 7, 2023

@bghgary
Contributor

bghgary commented Dec 8, 2023

It looks like Microsoft's STL implementation limits string buffers to INT_MAX.

https://github.com/microsoft/STL/blob/cf1313c39169dc376761eddee23c5e408e01aaa9/stl/inc/sstream#L252-L261

@bghgary
Contributor

bghgary commented Dec 8, 2023

Looks like there were similar issues before:
microsoft/STL#578
microsoft/STL#388

@jimver04 Do you mind filing an issue on Microsoft's STL for your test case?

@bghgary
Contributor

bghgary commented Dec 8, 2023

@jimver04 I don't know what your code looks like, but GLBResourceWriter has a constructor with a second argument std::unique_ptr<std::iostream> tempBufferStream. If you pass in your own for this (maybe using a local file instead), it will avoid the usage of std::stringstream. Hopefully this is good enough until the STL is fixed.
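
A minimal sketch of the workaround suggested above, assuming only what the comment states: that the second constructor argument is a std::unique_ptr<std::iostream>. The helper name `MakeFileBackedStream` and the scratch file path are hypothetical; the GLBResourceWriter call itself is left out because its first argument depends on your setup.

```cpp
#include <fstream>
#include <ios>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Opens a file-backed read/write binary stream. The path is a caller-chosen
// scratch file (hypothetical; nothing in GLTF-SDK prescribes it).
inline std::unique_ptr<std::iostream> MakeFileBackedStream(const std::string& path) {
    auto file = std::make_unique<std::fstream>(
        path, std::ios::in | std::ios::out | std::ios::binary | std::ios::trunc);
    if (!file->is_open()) {
        throw std::runtime_error("cannot open temporary buffer file: " + path);
    }
    // std::fstream derives from std::iostream, so the pointer converts.
    return std::unique_ptr<std::iostream>(std::move(file));
}
```

The returned pointer would be passed as tempBufferStream in place of the default std::stringstream, so the buffer grows on disk rather than in a string buffer.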

@jimver04
Author

@bghgary It seems that it is an old issue that never got priority in MSVC. In GCC the limits seem to be based on __string_type::size_type
https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/sstream which is not stated so explicitly. So I guess it goes up to the available memory. I have started modifying GLTF-SDK to replace std::stringstream with std::string wherever necessary. Thanks for the information. My changes are pushed to: https://github.com/jimver04/glTF-SDK

@jimver04
Author

I have also raised the 4GB limitation of glTF with the Khronos Group:
KhronosGroup/glTF#1051 (comment)

@bghgary
Contributor

bghgary commented Dec 11, 2023

> @bghgary It seems that it is an old issue that never got priority in MSVC. In Gcc the limits seem to be based on __string_type::size_type https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/std/sstream which is not so explicitly provided. So I guess it goes up to memory available. I have started modifying GLTF-SDK to replace std:stringstream with std::string wherever necessary. Thanks for the information. My changes are send to branch: https://github.com/jimver04/glTF-SDK

I don't see this as an old issue. As I pointed to earlier, the code is checking against INT_MAX and fails to allocate bigger. The two old issues I pointed to are fixed as far as I can tell. I don't see a reason why std::stringstream should be limited to INT_MAX.

@jimver04
Author

I am trying to use std::string instead of std::stringstream. Why do you have caches (StreamCache, IStreamCache, StreamCacheLRU)? What do you cache: the buffer stream or the file output path? And why?
