-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
markdown file xgettext + gettext #92
base: main
Are you sure you want to change the base?
Conversation
|
||
let (input, content) = | ||
ingest_input(matches.get_one::<String>("input_dir").map(|d| d.as_str()))?; | ||
let lang_option = matches.get_one::<String>("language").unwrap().as_str(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make --lang
optional? Simply use an empty string if it is not given. Then people can add a language by hand later in the generated POT file (if they like).
.arg(Arg::new("input_dir").short('f').long("file")) | ||
.arg(Arg::new("language").short('l').long("lang").required(true)) | ||
.arg(Arg::new("pot_title").short('t').long("title")) | ||
.arg(Arg::new("output_dir").short('o').long("out")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please align the options with xgettext
from the Gettext utilities. We should only use a few options, I think something like
-o
/--output
: output file, if not given, we usestdout
.inputfile
: this is not a flag (no-
or--
), this is simply the positional arguennts given after the flags. If none are given, we usestdin
.
Actually, it seems like xgettext
does not set a title
or a lang
in the header entry (see Filling in the Header Entry). So I would suggest leaving them blank as well: less code for us to maintain! 😄
Ok(catalog) | ||
} | ||
|
||
fn ingest_input(input_dir: Option<&str>) -> anyhow::Result<(&str, String)> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make this function do the following:
- read files given explicitly on the command line (even if they don't end in
.md
) - recursively read files from directories given on the command line (filtering for files matching
*.md
: https://docs.rs/glob/ can help with this) - if a file named
-
is given, or if there are no files listed, then read fromstdin
.
I think the function is almost doing this already, only the handling of multiple inputs and recursive reading of directories is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the writeup for xgettext
. (I'll put this in a readme once I get the green light @mgeisler)
md-extract
This tool extracts translatable text from a given Markdown file or stdin and outputs it to a .po file or stdout.
Usage
md-extract [inputfile] -o/--output
Arguments
inputfile
- a specified file path
- if not provided, the program reads from stdin
Options
-o/--output
- specifies the output file name
- if not provided, the program writes to stdout
Notes:
- for simplicity sake, this tool doesn't handle multiple inputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mgeisler do you like this? Happy to implement once I get your approval :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a great start, thanks!
Looking at this again, I think we'll eventually want to shorten and simplify the names here. I'm okay with mdbook-xgettext
and mdbook-gettext
as plugins for mdbook
, but these programs will be run by users on a command line. So they should probably be shorter and more clear. Perhaps md-extract
and md-translate
? We can move things around after we merge the PR, so don't worry about this too much right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense -- will update my files accordingly
} | ||
} | ||
|
||
fn build_path(output_dir: Option<&str>, pot_title: &str) -> anyhow::Result<PathBuf> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like this function to be simpler:
- if
output
is"-"
, then write to stdout - otherwise write to the path given as output. People can create the necessary directories themselves.
Ok(path) | ||
} | ||
|
||
fn main() -> anyhow::Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit in doubt about the best command line interface for this...
One idea would be to allow
markdown-gettext xx.po foo.md -o foo-xx.md
which would translate foo.md
using xx.po
and write the output to foo-xx.md
.
Here, foo.md
could be left out or it could be -
and then we read from stdin
(like for markdown-xgettext
below). Similar, -o
can be left out or be -
and then we write to stdout
.
If there are multiple inputs, then I guess the user cannot write to stdout
and must instead give an output directory:
markdown-gettext xx.po foo.md bar.md --output-dir xx
In this case it would be an error to attempt to read from stdin
since we don't really know what to name the output file in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding correctly, we want optional flags for output file title and directory.
Since markdown-gettext xx.po foo.md -o foo-xx.md
, does this mean if one was to add an additional wef.md, would the command be: markdown-gettext xx.po foo.md wef.md -o foo-xx.md wef-xx.md
?
Output directory makes sense to me 👍. I would just like some clarification on file naming conventions, especially when one wants to translate multiple files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding correctly, we want optional flags for output file title and directory.
Yeah, I'm trying to figure out how we can model the behavior here to be similar to existing Unix commands.
We should also keep it simple! Less code for all of us to write and maintain 😄
I'm thinking of using cp
as a basis:
cp
can copy a single path to a new name orcp
can copy multiple path into a single directory
The paths which are copied can be files or directories (or a mix of both). If there is a directory, then you must add -r
to do a recursive copy.
I now see
markdown-gettext xx.po source.md dest.md
as
cp source.md dest.md
but with the translation happening as well.
Similarly,
markdown-gettext xx.po foo.md bar.md dest/
as
cp foo.md bar.md dest/
Finally, the rule out -r
should be implemented: if you want to recursively translate a directory, you have to add a -r
flag.
This new thinking is different from what I had before where I thought we should use the flags from the GNU Gettext utilities. Back then, I hadn't realized that there isn't a command line program which mirrors what our markdown-gettext
program will do 😄
Note that these are just ideas, please let me know if you have better ones!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mgeisler I'll write up a doc that outlines what the CLI should do for both xgettext
and gettext
. Once we get everything to your liking, I'll change the binaries to follow those requirements!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds great! That document could simultaneously become the manual for the tools, so your work here will be reused right away.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the writeup for gettext
markdown-gettext
This tool copies source paths to destination paths while performing translations.
Usage
For a single file translation:
md-gettext lang.po source.md dest.md
note for single file translations:
- if
source.md
is not specified, the program reads fromstdin
- if
dest.md
is not specified, the program outputs tostdout
For multiple paths:
md-gettext lang.po foo.md bar.txt wef.md dest/
For directories, you must use the -r
flag for a recursive translation:
md-gettext lang.po sourcedir/ destdir/ -r
Arguments
-
xx.po
- the translation file in
.po
format
- the translation file in
-
source.md foo.md bar.txt sourcedir/
etc...- source paths which can be a mix of files and directories
-
dest.md, dest/
- destination paths
Options
-r
- enables recursive translation for directories
- must be added if you want to translate contents of a directory and its subdirectories
Please let me know what you think! @mgeisler @djmitche. I tried to be as simple as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @friendlymatthew, sorry for the slow reply. The above looks great, thank you very much for writing it up! I think it follow the behavior from the cp
manual page closely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries @mgeisler, I appreciate the opportunity to work on this 😁.
When I have some time off from official work, I'll get on this ASAP
fixes: #68
markdown-gettext binary
You can execute this binary by the following:
or
* Note: --dir or -f must be specified, -po is always required, and --out defaults to '/translated_md_files`
Suppose in
comprehensive-rust
I want to translate all md files in theconcurrency
section to japanese:❯ markdown-gettext --dir=src/concurrency --po=po/ja.po --out=concurrency_ja_md_files ❯ cd concurrency_ja_md_files ❯ ls channels.md scoped-threads.md send-sync.md shared_state.md threads.md
threads.md
__
markdown-xgettext binary
You can execute this binary by the following:
or
* Note: If --lang is not specified, the .pot default to English.
Example command:
output.po