
MemoryError #4

Open
2718455213wcx opened this issue Nov 14, 2023 · 2 comments

Comments

@2718455213wcx

I used Drain in logpai to parse the Thunderbird dataset (29.8 GB) without getting a MemoryError, but I did get a MemoryError when I parsed the split Thunderbird dataset (2.92 GB) using Brain in logpai. When I parse a 1000 MB Thunderbird dataset there is no MemoryError. Why is that? Can Brain only parse datasets of around 1 GB in size?
```
Traceback (most recent call last):
  File "E:\logbert-main\TBird\data_process.py", line 137, in <module>
    parse_log(data_dir, output_dir, log_file, parser_type)
  File "E:\logbert-main\TBird\data_process.py", line 77, in parse_log
    parser.parse(log_file)
  File "E:\logbert-main\TBird\..\logparser\Brain.py", line 58, in parse
    group_len, tuple_vector, frequency_vector = self.get_frequecy_vector(
  File "E:\logbert-main\TBird\..\logparser\Brain.py", line 261, in get_frequecy_vector
    set.setdefault(str(lenth), []).append(token)
MemoryError
```

@gaiusyu (Owner) commented Nov 14, 2023

You can try splitting the dataset into small enough chunks until you don't get any memory errors. If your PC has more memory, Brain will be able to parse larger datasets. Maybe I will improve Brain to reduce its memory overhead in the future 😂
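A minimal sketch of the chunking workaround suggested above: split the large log file into fixed-size pieces and parse each piece separately. The chunk size and output layout here are assumptions, and the parser call at the end is hypothetical (Brain's exact Python API is not shown in this thread); only the file-splitting part is concrete.

```python
import os

def split_log(path, lines_per_chunk=1_000_000, out_dir="chunks"):
    """Split a large log file into smaller files of at most
    lines_per_chunk lines each, returning the chunk paths."""
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    chunk, idx = [], 0
    with open(path, "r", errors="replace") as src:
        for line in src:
            chunk.append(line)
            if len(chunk) >= lines_per_chunk:
                chunk_paths.append(_write_chunk(chunk, idx, out_dir))
                chunk, idx = [], idx + 1
        if chunk:  # flush the final partial chunk
            chunk_paths.append(_write_chunk(chunk, idx, out_dir))
    return chunk_paths

def _write_chunk(lines, idx, out_dir):
    chunk_path = os.path.join(out_dir, f"chunk_{idx:04d}.log")
    with open(chunk_path, "w") as dst:
        dst.writelines(lines)
    return chunk_path

# Hypothetical usage with a Brain-style parser object:
# for chunk_path in split_log("thunderbird.log", lines_per_chunk=500_000):
#     parser.parse(chunk_path)  # parse each chunk independently
```

Splitting on line boundaries keeps every log message intact, which matters for template-mining parsers like Brain that group messages by token statistics.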

@a13382735176 commented

thank you man!
