How is te-kerende used? #28
Comments
I was the one that brought this up. I don't know how you should word things in the document, but these constructions are not Manding "compounds" in a linguistic sense. Do you want a linguistic explanation of the "distributive construction" or are you looking for an explanation of the orthographic convention used to represent it? In Latin-based Bambara, there is no special way to mark a distributive construction like there is in the N'ko-based tradition. For instance, in the Latin-based tradition:
But in the N'ko orthographic tradition it would be done like this: ߞߏ_ߏ_ߞߏ߫ [NOTE: I also see that your vowel length isn't right on mɔɔ ɔ mɔɔ. You also don't include tonal diacritics. Not sure what your convention for transliterating or interpreting N'ko in Latin-based orthography is in the document.] |
[Just to be clear, the image containing examples above is screen snapped from the Unicode proposal for adding a character to the N'Ko block. The transcriptions are nothing to do with me.] Looks like i did attempt some transcriptions at https://r12a.github.io/scripts/nkoo/nqo.html#word though (which may indeed need to be changed - i don't remember the source of those). In those notes i assume that one is supposed to use U+07FA NKO LAJANYALAN to represent the te-kerende. Any thoughts on that? @donaldsoncd do you have a pointer to an explanation of how these kinds of linguistic device are used? Searching doesn't seem to yield anything useful. I think i get the general idea, but is it just used for a handful of words, or can one generate one's own te-kerende linked sequences? |
I would love to hear @donaldsoncd's response. I am commenting on @r12a's question about "... is it just used for a handful of words, or ...?" I think this is a common expression in Manding languages to express the individuality, repetitiveness, or infinity of a related action. For example: the pattern of su-u-su 'every night' in your posting also gives soma-a-soma (soma soma) 'every morning', tele-e-tele (tele tele) 'every afternoon', wura-a-wura (wura wura) 'every evening'. The same thing is possible for mo-o-mo 'everyone', ke-e-ke (ke ke) 'every man', moso-o-moso (moso moso) 'every woman'; you get the idea. |
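To make the productivity point concrete, here is a minimal Python sketch of the pattern visible in the Latin-based examples in this comment: the word, a copy of its final vowel, then the word again. The function name `distributive` and the vowel set are assumptions made for illustration; the sketch ignores tone marking, vowel length, and any variety-specific rule for choosing the inserted vowel.

```python
# Hypothetical helper: mirrors only the surface pattern of the examples
# quoted in the thread (su-u-su, soma-a-soma, moso-o-moso).
VOWELS = set("aeiouɛɔ")  # assumed vowel inventory for Latin-based spelling


def distributive(word: str, sep: str = "-") -> str:
    """Return word + final vowel + word, joined by `sep`."""
    if not word or word[-1].lower() not in VOWELS:
        raise ValueError(f"expected a word ending in a vowel, got {word!r}")
    return sep.join([word, word[-1], word])


for w in ["su", "soma", "tele", "wura", "moso"]:
    print(w, "->", distributive(w))
```

Any vowel-final noun can be fed through the same function, which matches the suggestion that the construction is productive rather than limited to a handful of words.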
In Bambara and Jula, the distributive construction is built by inserting an
In some varieties of Manding (and in N'ko orthography), the vowel that is
In N'ko orthography the convention is to always write this grammatical construction with the vowel harmony option (as well as the appropriate tonal diacritics, since they play a role as well) PLUS the te-kerende underscore line. For more details on this construction across Manding varieties, you could consult linguistic reference grammars such as:
|
Very helpful, @donaldsoncd. So would you agree that this is written using U+07FA NKO LAJANYALAN? |
As far as I can tell, the te-kerende was never encoded. The lajanyalan is different since it connects to the letters on both sides. The te-kerende should not connect to the letters. |
@NeilSureshPatel i concur about no encoding for a separate lajanyalan, but i didn't find any rationale, or indication of what should be used. I'll see whether i can get some enlightenment from the Unicode Editorial folks on Thursday. In my examples i've been using lajanyalan surrounded by spaces to create the appearance. |
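As a rough illustration of the workaround described in this comment (lajanyalan surrounded by spaces), the Python sketch below joins the parts of a sequence with U+07FA flanked by ordinary spaces. The helper name `link_with_spaced_lajanyalan` is hypothetical, and the N'Ko strings are copied from the example earlier in the thread; this only shows the code point sequence, not how any particular font will render it.

```python
import unicodedata

LAJANYALAN = "\u07FA"  # U+07FA NKO LAJANYALAN


def link_with_spaced_lajanyalan(*parts: str) -> str:
    """Join parts with SPACE + lajanyalan + SPACE (the workaround, not an encoded te-kerende)."""
    return f" {LAJANYALAN} ".join(parts)


# Parts taken from the N'Ko example quoted earlier in this thread.
linked = link_with_spaced_lajanyalan("ߞߏ", "ߏ", "ߞߏ߫")
print(linked)
print([f"U+{ord(ch):04X}" for ch in linked])  # inspect the stored code points
print(unicodedata.name(LAJANYALAN))
```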
@r12a I was just submitting an issue on the Noto N'ko repo and I saw this other issue. notofonts/nko#5 Part way down Denis says the following: This seems to suggest that the plan for N'ko is to use standard hyphens that are moved to the baseline. This seems a bit odd though. |
I know nothing about the encoding of this. I just do Latin underscores if I have to write it. |
Debbie Anderson pointed me to a discussion at the UTC in 2016. See point 11 at https://www.unicode.org/L2/L2016/16037-script-rec.pdf.
This may be the source of the comments by @moyogo. I also think it's a bit odd. I took a look at the few resources i have to hand that provide selectable online text and found the following. Wikipedia uses hyphen and lajanyalan on the same page, where the former look like ordinary hyphens (mid height, and no spacing), while the latter (surrounded by spaces) is used for what look like te-kerende. eg. Silabosoona at http://cormand.huma-num.fr/maninkabiblio/periodiques/silabosoona5.pdf also uses lajanyalan for te-kerende, but also for general phrase separators, eg. ߊ߭ߜߊߘߡߝ ߺ ߋ ߺ ߋ߫ߝ߸ߋ߫ߦ |
This certainly is a bit messy. The standard mid-height hyphen is used with numbers. This can be seen on page 1 of Silabosoona. ߆߂߁-߇߄-߀߃-߀߀ My guess is that the regular hyphen used in text on Wikipedia is more of a workaround than a preference. It is intuitive to use it since it is used in the Latin orthography, whereas typing spaces around a lajanyalan is less convenient. The use of the lajanyalan with spaces does come with other problems. The lajanyalan is really wide compared to a te-kerende. This is exacerbated by the fact that it has negative side bearings for its normal joining behavior. When you add spaces the extra length becomes exposed. The other problem is that the parts of the lajanyalan that overlap with adjacent letters may not have square edges. This varies by font, but there are times the bottoms need to be curved or chamfered so that it doesn't punch through the join between an adjacent letter and its baseline stroke. This can be more extreme if any negative kerning is used. For example: If the edge were squared off, the corner would make the join not smooth. It is subtle in this example. If one were to apply effects, like outlined text, etc., this could become more obvious and problematic. Without separate encoding, I think the best way to handle this is to have an alternate N'ko hyphen that is pushed down to the baseline which is replaced contextually (when nested between or following N'ko letters) in the font via |
Hmm. Another part of the messiness is whether or not other people will be inclined to use the hyphen with the expectation that it will magically change position and shape in the required contexts, or will they (as they seem to be doing) just go for the thing that looks to them as if it's what they want to see on the page (ie. the lajanyalan). I looked at a number of other online resources, and those that contain te-kerende and dashes that separate phrases all use lajanyalan, so it seems it may have already become the de facto way of doing this. I wonder whether it makes sense to do the opposite of what you're suggesting @NeilSureshPatel: ie. to fix the font so that the lajanyalan is the right width and has the right shaping when it appears between spaces. This may be an easier context to detect, given that spaces are, it seems, always present, and the joining behaviour is not relevant if spaces are on either side? |
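The "between spaces" context mentioned here is easy to detect in plain text as well, which may help when surveying existing content. The following Python sketch counts lajanyalan occurrences that have whitespace on both sides versus all other occurrences. The function name and the two labels are assumptions for illustration; it says nothing about how a font's shaping rules would be written.

```python
import re

LAJANYALAN = "\u07FA"  # U+07FA NKO LAJANYALAN

# A lajanyalan with whitespace immediately before and after it.
SPACED = re.compile(rf"(?<=\s){LAJANYALAN}(?=\s)")


def count_lajanyalan_uses(text: str) -> dict:
    """Count spaced (te-kerende-like) versus other lajanyalan occurrences."""
    total = text.count(LAJANYALAN)
    spaced = len(SPACED.findall(text))
    return {"spaced": spaced, "other": total - spaced}


sample = f"a {LAJANYALAN} b c{LAJANYALAN}d"
print(count_lajanyalan_uses(sample))
```

Occurrences at the very start or end of the text fall into "other", since they lack whitespace on one side.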
Ahh, yes, good point @r12a. I guess once a workaround gets normalized we kind of have to work with it. I can see what you are suggesting working. The lajanyalan can be narrowed, squared off and have zero or near-zero sidebearings. When strung together for justification it should still make a solid line. A related approach is to take advantage of the fact that the lajanyalan can have positional forms. Therefore, the isolated form can be more tuned for use as a te-kerende and then the positional forms can have more flexibility in design depending on the font. Spaces will break the shaping and default to the isolated form, as you say. A thin space would be ideal over a word space, but this can be handled in a handful of different ways. |
One issue with using lajanyalan surrounded by spaces is that this will tend to allow a line-break to happen either side of it, whereas my understanding is that if a line-break is needed, it should always occur after the te-kerende. In theory, if the preceding space were a non-breaking space, that wouldn't be a problem, but in practice users will inevitably type normal spaces most of the time. |
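The non-breaking space idea in this comment could be applied after the fact to text that was typed with ordinary spaces. Below is a small Python sketch, assuming the spaced-lajanyalan convention discussed in this thread; the function name `protect_te_kerende` is hypothetical. It swaps only the space before the lajanyalan for U+00A0 NO-BREAK SPACE, so a break opportunity remains after the following ordinary space.

```python
import re

LAJANYALAN = "\u07FA"  # U+07FA NKO LAJANYALAN
NBSP = "\u00A0"        # U+00A0 NO-BREAK SPACE

# An ordinary space that is immediately followed by lajanyalan + space.
BEFORE_SPACED = re.compile(f" (?={LAJANYALAN} )")


def protect_te_kerende(text: str) -> str:
    """Replace the space before a spaced lajanyalan with NBSP; leave joined uses alone."""
    return BEFORE_SPACED.sub(NBSP, text)


fixed = protect_te_kerende(f"a {LAJANYALAN} b {LAJANYALAN} c")
print([f"U+{ord(ch):04X}" for ch in fixed])
```

Whether line breaking then behaves as intended still depends on the layout engine in use.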
I brought this up with the Script AdHoc (SAH) Unicode committee and consensus was reached that it is ok to use lajanyalan for te-kerende and certain other hyphen-like uses where the glyph is expected to look like a baseline extension surrounded by spaces. |
Thanks for the update @r12a. I was curious to know where the discussion landed on the matter. What did the SAH say about the line-breaking concern that @jfkthame brought up? I think from a font production standpoint, I would still substitute a lajanyalan nested between spaces with an alternate form just to make it narrower and remove any modeling of the overlapping parts of the stroke. |
@NeilSureshPatel The line-breaking discussion was put off for another day. A proposal would need to be submitted. Personally, i'm not so worried about that – just as with dashes in English, such as the one i just typed, people can use a nbsp if needed. I think the problem of handling line breaks around punctuation that is separated from the preceding text is a lot bigger than just N'Ko (think of dandas, French question marks, Mongolian commas, etc. etc.) and may need a more generalised solution. I think that the proposal to shape the lajanyalan appropriately makes sense. I was planning to raise an issue in the Noto repo – would you prefer to do that? (You're better qualified than me to put the right points.) Btw, i'm about to raise then close a gap report about this in our gap analysis framework, so that we can make the progress visible. |
That makes sense, thanks. I'll take a look at the Noto design again to see if it would need to be adjusted and how. Noto uses very simple connections so it may only need a width adjustment. I'll raise an issue in the repo with the proper recommendation. |
I wasn't part of the background discussion here, so may be missing lots of context. But personally, I think the conclusion is unfortunate, from a serving-the-users point of view. Judging from the examples in Figure 5 of the Unicode proposal document, I don't think users would perceive the te-kerende as being separated by spaces from the surrounding words, so the natural instinct will be to type it without spaces. When they notice that this produces a joined form (because that's how lajanyalan behaves), they're just as likely to try something else such as a generic HYPHEN-MINUS or LOW LINE as to figure out that they should put spaces each side of it (and depending on the font in use, the result of adding spaces may look so bad — because lajanyalan is too long — that they reject that and go for HYPHEN-MINUS or even borrow the Arabic-script KASHIDA instead). My suspicion is that "correct" use of (The "hyphen" comparison isn't very persuasive, IMO. I notice that what's actually in your |
One of the things driving this discussion was that i looked at a number of online texts to figure out what users do, and they all used It may be better to move this discussion to a separate issue focused on line-breaking for te-kerende. (in my earlier comment i have just changed 'hyphens' to 'dashes in English', which i intend to cover hyphens and other dashes.) |
I'm not sure why they would do that, or why they would try not to use spaces. The lajanyalan is the N'Ko equivalent of the Arabic tatweel (which i assume you mean by kashida). And using it without spaces would immediately produce incorrect results, because (a) it would join with the adjacent characters (as would the tatweel), and (b) it wouldn't produce the gaps either side which always appear with te-kerende. So i don't think that users are likely to omit the spaces. (That said, for fine typography, they may perhaps choose slightly smaller spaces.) |
These things are always weird. I think if the te-kerende were encoded from the get-go, it would have been used readily. However, without it the most convenient thing to do is |
The Unicode Proposal for inclusion of te-kerende describes it as:
and gives the following examples:
It has been suggested that 'link compounds together' is not related to compound nouns, but is rather a special kind of distributional construction that N’ko authors sometimes mark this way. Can anyone explain this usage in a little more detail or provide me with some better wording for the lreq doc?