feat: add proper torch caching
changed Torch autocast inference to be used only on CPU
changed how executors determine their device types
Marie Dev Bot committed Nov 27, 2023
1 parent 120ba6a commit 6a75565
Showing 18 changed files with 475 additions and 194 deletions.
66 changes: 62 additions & 4 deletions CONTRIBUTING.md
@@ -1,12 +1,32 @@
# Contribute to MarieAI(🦊)

Thanks for your interest in contributing to MarieAI. We're grateful for your initiative! ❤️

In this guide, we walk through the steps for each kind of contribution, with good and bad examples of what to do. We look forward to your contributions!


<a name="-bugs-and-issues"></a>
## 🐞 Bugs and Issues

### Submitting Issues

We love to get issue reports. But we love them even more if they're in the right format. For any bugs you encounter, we need you to:

* **Describe your problem:** What exactly is the bug? Be as clear and concise as possible.
* **Why do you think it's happening?** If you have any insight, here's where to share it.

There are also a couple of nice-to-haves:

* **Environment:** You can find this with ``marie -vf``
* **Screenshots:** If they're relevant

# Coding standards

To ensure the readability of our code, we stick to a few conventions:

* We format python files using black.
* For linting, we use flake8.
* For sorting imports, we use isort.
* We format python files using `black`.
* For linting, we use `flake8`.
* For sorting imports, we use `isort`.


The `setup.cfg` and `pyproject.toml` already contain the proper configuration for these tools.
@@ -20,4 +40,42 @@ You can install it using
```shell
pip install pre-commit
pre-commit install
```

<a name="-naming-conventions"></a>
## ☑️ Naming Conventions

For branches, commits, and PRs we follow some basic naming conventions:

* Be descriptive
* Use all lower-case
* Limit punctuation
* Include one of our specified [types](#specify-the-correct-types)
* Keep it short (under 70 characters is best)
* In general, follow the [Conventional Commit](https://www.conventionalcommits.org/en/v1.0.0/#summary) guidelines

Note: If you don't follow naming conventions, your commit will be automatically flagged to be fixed.

### Specify the correct types

The type is an important prefix in PR and commit messages. For each branch, commit, or PR, we need you to specify the type to help us keep things organized. For example:

```
feat: add hat wobble
^--^ ^------------^
| |
| +-> Summary in present tense.
|
+-------> Type: build, ci, chore, docs, feat, fix, refactor, style, or test.
```

- build: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
- ci: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
- docs: Documentation only changes
- feat: A new feature
- fix: A bug fix
- perf: A code change that improves performance
- refactor: A code change that neither fixes a bug nor adds a feature
- style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.)
- test: Adding missing tests or correcting existing tests
- chore: Updating grunt tasks, etc.; no production code change
66 changes: 63 additions & 3 deletions docs/docs/getting-started/contributing/contributing.md
@@ -4,13 +4,33 @@ sidebar_position: 1

# Contribute to MarieAI(🦊)

Thanks for your interest in contributing to MarieAI. We're grateful for your initiative! ❤️

In this guide, we walk through the steps for each kind of contribution, with good and bad examples of what to do. We look forward to your contributions!


<a name="-bugs-and-issues"></a>
## 🐞 Bugs and Issues

### Submitting Issues

We love to get issue reports. But we love them even more if they're in the right format. For any bugs you encounter, we need you to:

* **Describe your problem:** What exactly is the bug? Be as clear and concise as possible.
* **Why do you think it's happening?** If you have any insight, here's where to share it.

There are also a couple of nice-to-haves:

* **Environment:** You can find this with ``marie -vf``
* **Screenshots:** If they're relevant

# Coding standards

To ensure the readability of our code, we stick to a few conventions:

* We format python files using black.
* For linting, we use flake8.
* For sorting imports, we use isort.
* We format python files using `black`.
* For linting, we use `flake8`.
* For sorting imports, we use `isort`.


The `setup.cfg` and `pyproject.toml` already contain the proper configuration for these tools.
@@ -26,6 +46,46 @@ pip install pre-commit
pre-commit install
```

<a name="-naming-conventions"></a>
## ☑️ Naming Conventions

For branches, commits, and PRs we follow some basic naming conventions:

* Be descriptive
* Use all lower-case
* Limit punctuation
* Include one of our specified [types](#specify-the-correct-types)
* Keep it short (under 70 characters is best)
* In general, follow the [Conventional Commit](https://www.conventionalcommits.org/en/v1.0.0/#summary) guidelines

Note: If you don't follow naming conventions, your commit will be automatically flagged to be fixed.

### Specify the correct types

The type is an important prefix in PR and commit messages. For each branch, commit, or PR, we need you to specify the type to help us keep things organized. For example:

```
feat: add hat wobble
^--^ ^------------^
| |
| +-> Summary in present tense.
|
+-------> Type: build, ci, chore, docs, feat, fix, refactor, style, or test.
```

- build: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
- ci: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
- docs: Documentation only changes
- feat: A new feature
- fix: A bug fix
- perf: A code change that improves performance
- refactor: A code change that neither fixes a bug nor adds a feature
- style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.)
- test: Adding missing tests or correcting existing tests
- chore: Updating grunt tasks, etc.; no production code change



## Downloading large artifacts

Often you will find that for the executor to work, a large file needs to be downloaded first - usually this would be a file with pre-trained model weights. If this is done at the start of the executor, it will lead to really long startup times, or even timeouts, which will frustrate users.
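One common way to avoid this is to fetch the weights once into a local cache and reuse that copy on every later startup. The sketch below illustrates the idea only; the cache directory, URL, and helper name are assumptions for illustration, not MarieAI's actual download API.

```python
# Sketch only: CACHE_DIR, the URL, and ensure_weights() are hypothetical names,
# not part of MarieAI's real API.
import os
import urllib.request

CACHE_DIR = os.path.expanduser("~/.cache/marie-models")  # assumed cache location


def ensure_weights(url: str, filename: str) -> str:
    """Return a local path to the weights, downloading them only if missing."""
    os.makedirs(CACHE_DIR, exist_ok=True)
    local_path = os.path.join(CACHE_DIR, filename)
    if not os.path.exists(local_path):
        # The slow download happens at most once per machine, not on every startup.
        urllib.request.urlretrieve(url, local_path)
    return local_path


# Hypothetical usage:
# model_path = ensure_weights("https://example.com/trocr-large-printed.pt",
#                             "trocr-large-printed.pt")
```

With a helper along these lines, later executor startups only pay the cost of a file-existence check rather than a full download.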
151 changes: 79 additions & 72 deletions marie/document/trocr_ocr_processor.py
@@ -19,7 +19,7 @@
from marie.document.ocr_processor import OcrProcessor
from marie.lang import Object
from marie.logging.predefined import default_logger
from marie.logging.profile import TimeContext
from marie.logging.profile import TimeContext, TimeContextCuda
from marie.models.icr.memory_dataset import MemoryDataset

# required to register text_recognition
@@ -76,32 +76,24 @@ def init(model_path, beam=5, device="") -> Tuple[Any, Any, Any, Any, Any, Compos
else:
model[0] = model[0].to(device)

# Process time 80.9748 seconds : With compile - default
# Process time 85.9748 seconds : With no compile

# try to compile the model with torch.compile - cudagraphs
if device == 'cuda':
try:
# Optimize model for Inference time
with TimeContext("Compiling TROCR model", logger=logger):
import torch._dynamo as dynamo

model[0] = torch.compile(model[0])
if False:
model[0] = torch.compile(
model[0],
# mode="max-autotune",
fullgraph=True,
backend="inductor",
options={"max_autotune": True, "triton.cudagraphs": True},
)
except Exception as e:
logger.error(f"Failed to compile model : {e}")
try:
with TimeContext("Compiling TROCR model", logger=logger):
model[0] = torch.compile(model[0], mode="max-autotune", dynamic=True)
if False:
model[0] = torch.compile(
model[0],
fullgraph=True,
dynamic=True,
backend="inductor",
options={"max_autotune": True, "triton.cudagraphs": True},
)
except Exception as e:
logger.error(f"Failed to compile model : {e}")

img_transform = transforms.Compose(
[
transforms.Resize((384, 384), interpolation=InterpolationMode.BICUBIC),
# transforms.Resize((384, 384), interpolation=InterpolationMode.LANCZOS),
transforms.ToTensor(),
transforms.Normalize(0.5, 0.5),
]
@@ -120,7 +112,6 @@ def init(model_path, beam=5, device="") -> Tuple[Any, Any, Any, Any, Any, Compos
return model, cfg, inference_task, generator, bpe, img_transform, device


# @Timer(text="preprocess_image in {:.4f} seconds")
def preprocess_image(image, img_transform, device):
im = image.convert("RGB").resize((384, 384), Image.BICUBIC)
# this causes an error when batching due to the shape in deit.py
@@ -133,7 +124,6 @@ def preprocess_image(image, img_transform, device):
return im


# @Timer(text="preprocess_samples in {:.4f} seconds")
def preprocess_samples(src_images, img_transform, device):
images = []
for image in src_images:
@@ -148,7 +138,6 @@ def preprocess_samples(src_images, img_transform, device):
return sample


# @Timer(text="Text in {:.4f} seconds")
def get_text(cfg, task, generator, model, samples, bpe):
predictions = []
scores = []
@@ -197,7 +186,6 @@ def get_text(cfg, task, generator, model, samples, bpe):

class TrOcrProcessor(OcrProcessor):
MODEL_SPEC = TrOcrModelSpec(None, None, None, None, None, None, None)
INITIALIZED = False

def __init__(
self,
@@ -209,7 +197,6 @@ def __init__(
super().__init__(work_dir, cuda, **kwargs)
model_path = os.path.join(models_dir, "trocr-large-printed.pt")
logger.info(f"TROCR ICR processor [cuda={cuda}] : {model_path}")

if not os.path.exists(model_path):
raise Exception(f"File not found : {model_path}")

@@ -221,8 +208,7 @@ def __init__(
device = "cuda" if cuda else "cpu"

start = time.time()
# beam = 5
beam = 1
beam = 1 # default beam size is 5
(
model,
cfg,
@@ -293,51 +279,72 @@ def __recognize_from_fragments(
start = time.time()
model_spec: TrOcrModelSpec = self.MODEL_SPEC

with torch.amp.autocast(
device_type=model_spec.device, enabled=True, cache_enabled=True
# not seeing any better performance with amp on GPU, we are actually having about 15% performance hit
amp_enabled = False
if self.device == "cpu":
amp_enabled = True

def log_cuda_time(cuda_time):
logger.warning(f"CUDA : {cuda_time}")
# write to the text file
with open("/tmp/cuda_time_autocast_compiled.txt", "a") as f:
f.write(f"{cuda_time}, {len(src_images)}, {amp_enabled}\n")

with TimeContextCuda(
"TrOcr inference", logger=logger, enabled=False, callback=log_cuda_time
):
for i, batch in enumerate(batchify(src_images, batch_size)):
eval_data = MemoryDataset(images=batch, opt=opt)
batch_start = time.time()

images = [img for img, img_name in eval_data]
samples = preprocess_samples(
images, model_spec.img_transform, model_spec.device
)
predictions, scores = get_text(
model_spec.cfg,
model_spec.task,
model_spec.generator,
model_spec.model,
samples,
model_spec.bpe,
)

for k in range(len(predictions)):
text = predictions[k]
# TODO: make this configurable as an option. Different models can return different cases
text = text.upper() if text is not None else ""
score = scores[k]
_, img_name = eval_data[k]
confidence = round(score, 4)
row = {"confidence": confidence, "id": img_name, "text": text}
results.append(row)
logger.debug(f"results : {row}")

logger.info(
"Batch time [%s, %s]: %s"
% (i, len(batch), time.time() - batch_start)
)

del scores
del predictions
del images
del samples
del eval_data

torch_gc()
logger.info("ICR Time elapsed: %s" % (time.time() - start))
with torch.amp.autocast(
device_type=model_spec.device,
enabled=amp_enabled,
cache_enabled=True,
):
for i, batch in enumerate(batchify(src_images, batch_size)):
eval_data = MemoryDataset(images=batch, opt=opt)
batch_start = time.time()

images = [img for img, img_name in eval_data]
samples = preprocess_samples(
images, model_spec.img_transform, model_spec.device
)
predictions, scores = get_text(
model_spec.cfg,
model_spec.task,
model_spec.generator,
model_spec.model,
samples,
model_spec.bpe,
)

for k in range(len(predictions)):
text = predictions[k]
# TODO: make this configurable as an option. Different models can return different cases
text = text.upper() if text is not None else ""
score = scores[k]
_, img_name = eval_data[k]
confidence = round(score, 4)
row = {
"confidence": confidence,
"id": img_name,
"text": text,
}
results.append(row)
logger.debug(f"results : {row}")

logger.info(
"Batch time [%s, %s]: %s"
% (i, len(batch), time.time() - batch_start)
)

del scores
del predictions
del images
del samples
del eval_data

logger.info("ICR Time elapsed: %s" % (time.time() - start))
except Exception as ex:
print(traceback.format_exc())
raise ex
finally:
torch_gc()
return results