Support TransformerCli 'skip' option #87

Open
zanerock opened this issue Apr 9, 2019 · 3 comments

zanerock commented Apr 9, 2019

The --skip option in TransformerCli is documented at the head, but later on it's noted that it isn't actually implemented. The option is useful, or even necessary, for processing large data sets.

zanerock (Author) commented Apr 9, 2019

I haven't dug into this, but I suspect the reason is that Transformer is using ZipInputStream, which is inherently linear. Switching to, or optionally using, ZipFile may be a relatively easy and efficient solution. Regardless, I may be able to take up the issue.
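
To illustrate the difference, here is a minimal sketch using only the standard java.util.zip classes. It is not taken from the project: the class name ZipSkipSketch and both method names are hypothetical, and it assumes "skip" means "ignore the first N entries of the archive". With ZipInputStream, each skipped entry still has to be read past before the next one can be reached. ZipFile reads the archive's central directory, so skipped entries are never opened.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical illustration; not part of the project's code base.
public class ZipSkipSketch {

    /** Sequential access: every skipped entry is still stepped through in the stream. */
    static List<String> namesAfterSkipStreaming(InputStream in, int skip) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(in)) {
            ZipEntry entry;
            int index = 0;
            // getNextEntry() closes the current entry, reading past its data, before moving on.
            while ((entry = zin.getNextEntry()) != null) {
                if (index++ >= skip) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }

    /** Random access: entries are listed from the central directory, so skipped ones are never opened. */
    static List<String> namesAfterSkipRandomAccess(Path zipPath, int skip) throws IOException {
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            return zip.stream()
                      .skip(skip)
                      .map(ZipEntry::getName)
                      .collect(Collectors.toList());
        }
    }
}
```

One caveat with this sketch: ZipFile needs a real file on disk rather than an arbitrary stream, and it lists entries in central-directory order, which usually but not necessarily matches the order in which ZipInputStream encounters them.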

bgfeldm (Contributor) commented Apr 9, 2019

TransformerCli was deprecated in favor of the newer gov.uspto.bulkdata.cli.Transformer, which supports skip.

bgfeldm added the wontfix label Apr 9, 2019

zanerock (Author) commented

Gotcha, good to hear. Could the deprecated class be deleted?
