-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IAWG: First pass at publish_lookup chapter #398
base: master
Are you sure you want to change the base?
Conversation
Chap_API_Publish.tex
Outdated
\item The the requester falls within the range specified by the publisher. | ||
\item If the publisher specified access permissions, the effective \ac{UID} and \ac{GID} of the requester must meet those requirements. | ||
\end{enumerate} | ||
|
||
Fig.~\ref{fig:publish_lookup} shows process P0 executing on host X in namespace 1 in session A publishing the key "tree" on three ranges. Publishing to different ranges is allowed. | ||
Fig.~\ref{fig:publish_lookup} shows process P0 executing on Host X in Namespace 1 in Session A publishing the key "tree" on three ranges. Publishing to different ranges is allowed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't been able to look at the figure, but I suspect this opens up a real can of worms. The problem is that the "namespace" range is encompassed within the "session" range, and so now (as you point out later in the text) there is ambiguity as to which value of "tree" is intended. I don't think we want to make publish/lookup that complex.
Keep in mind that this is a rarely used feature. It's intent is not to support detailed database-like operations, but rather to assist in async rendezvous or coordination operations. Thus, we don't need complex store/retrieve rules like you see with SQL.
We could just require that all keys be globally unique, but that could cause some difficulty for app-to-app rendezvous (e.g., "ocean" trying to rendezvous with "air"). Attempting to prefix keys with user or job IDs helps, but it still imposes a requirement that the other app know the information used to prefix the key - and that might not always be doable.
Perhaps the best option is to just require that you specify the range from which the data is to be retrieved. Thus, if you want the key value that was published on the "session" range, you have to ask for the "session" range. If you want the value published on the "namespace" range, then you have to ask for that range. If you ask for the value published on the "namespace" range and it wasn't published there (e.g., it was published only on the "session" range), then you get "not found".
Or something like that - my point being that this sounds too complex and is likely more than what we really need or want to support. We could even just eliminate the ability to publish the same key on different ranges - I have no issue with doing so, and I can't think of anyone currently using this API who would be impacted by it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I was just trying to capture what I thought the document says currently, but I agree it is very complex to understand and to implement and it still has non-determinant cases (results that may vary based on implementation). I'm all for making something that is as simple as possible both to understand and implement, even if that means it is not very rich in features. Like you said, this is just meant to be a simple rendezvous mechanism. I need to think more about what options we have and what we could do without breaking backward compatibility. This is one of those cases where I don't think anyone is invested in the way things "work" now, but not sure how much liberty we should take on changing behavior.
On another note, I would love to see the Publish API take 2 info arrays... one for the things to publish and one for modifiers to the call. As it is now if I call Publish with an array of infos like:
PMIX_RANGE=PMIX_RANGE_NAMESPACE
tree=oak
PMIX_RANGE=PMIX_RANGE_GLOBAL
tree=elm
PMIX_TIMEOUT=10
then we have several problems: If the implementation doesn't handle or recognize PMIX_TIMEOUT then it will actually publish PMIX_TIMEOUT as a key since it doesn't recognize it as a directive. Also, will it publish tree=oak at the namespace range and tree=elm at the Global range or will it publish only one of the values of tree at some range (either namespace or global)? Things get really tricky when the things to publish are combined with the directives.
But deprecating existing API's for new Publish API's seems like a lot of work for something that very few products are using and probably have figured out a way to not get tripped up by these things already. I don't want the pmix-standard to be a burden to implementations or progress, but I would like it to describe how things actually work so someone can develop new clients without getting lost or confused. I'll see if I can think up some way to simplify what is supported without breaking backward compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So don't deprecate the API, but add an attribute for specifying modifiers to the call. It would take a pmix_data_array_t
containing an array of pmix_info_t
that has the modifiers in them. I don't think that would be much of a problem for backwards compatibility.
If you prefer, you could add an attribute for the things to publish, leaving all else in the call to be modifiers. Either way should be fine.
I wouldn't worry too much about changes in behavior at this point. OMPI is the primary user of it (for connect/accept), and I've already outlined a way for them to use PMIx_Group
instead - which is a far better and more performant solution for that use-case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that idea. We could have both approaches for a while, then move towards deprecating the old way after a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking some more, we should probably add the attribute for the things to publish (as opposed to the modifiers) so we remain consistent with the other APIs - whose info arrays all contain modifiers.
Chap_API_Publish.tex
Outdated
|
||
The figure shows a similar scenario for P2. However, unlike P1, P2 is in the same namespace as P0 which results in two possible published values of | ||
"tree" that could be returned by a call to \refapi{PMIx_Lookup} by P2 on the range \refconst{PMIX_RANGE_SESSION} or \refconst{PMIX_RANGE_NAMESPACE}. | ||
An implementation is required to return one of the possible matches in such cases where |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think we have to preclude this possibility - it should be an error.
Chap_API_Publish.tex
Outdated
@@ -73,7 +72,8 @@ \section{\code{PMIx_Publish}} | |||
The blocking form of this call will block until it has obtained confirmation from the datastore that the data is available for lookup. The \refarg{info} array can be released upon return from the blocking function call. | |||
|
|||
Publishing duplicate keys is permitted provided they are published to different | |||
ranges. Duplicate keys being published on the same data range shall return the | |||
ranges. Custom ranges are consider different if they have different members. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ranges. Custom ranges are consider different if they have different members. | |
ranges. Custom ranges are considered different if they have different members. |
Chap_API_Struct.tex
Outdated
The namespace and rank of the process that provided each data element is also returned. | ||
|
||
Note that the \refstruct{pmix_pdata_t} structures will be released upon return from the callback | ||
function, so the receiver must copy/protect the data prior to returning if it needs to be retain ed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
function, so the receiver must copy/protect the data prior to returning if it needs to be retain ed. | |
function, so the receiver must copy/protect the data prior to returning if it needs to be retained. |
fe6dd50
to
de8c3be
Compare
ae194c3
to
b483abc
Compare
Please use emoji reactions ON THIS COMMENT to indicate your position on this proposal.
Here are the meanings for the emojis:
|
b483abc
to
5ca709f
Compare
@dsolt Are all of the comments in this PR resolved? (If so then can you mark them as resolved?) |
Chap_API_Publish.tex
Outdated
|
||
publishing to a \refconst{PMIX_RANGE_CUSTOM} | ||
range which does not include the publisher will prevent | ||
any processes from using \refapi{PMIx_Lookup} to access the published data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just making this undefined behavior/incorrect code? This doesn't look useful at first glance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually not sure that statement is correct - the range determines who can access it. I'm not sure why the publisher would need to be a member of that group. The publisher always "owns" the data and can "unpublish" it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember @dsolt had some slides on this that were presented in the Q2 meeting. From the notes I have the following
publish/lookup supports multiple ranges
validity of publishing same key twice on the same range (WG will look more into it, it is believed to be illegal code with undefined behavior)
Group prefers solution where LOOKUP queries only the range it is given (i.e., to be visible, the PUBLISH must target the same range). No ‘inheritance’ of published items. The same key may be published on different range
So I think this was targeted at addressing those comments, but my memory is failing me on the details. We will discuss this more today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. At one point in our revision process we had determined that the publisher and process calling Lookup both had to be part of the group, but we decided to remove that restriction and allow the publisher to not be part of the group of processes, so this should have been removed when we made that switch.
@@ -195,25 +230,33 @@ \subsection{Range of Published Data} | |||
Undefined range. | |||
% | |||
\declareconstitemvalue{PMIX_RANGE_RM}{1} | |||
Data is intended for the host environment, or lookup is restricted to data published by the host environment. | |||
Published data and generated events are restricted to processes executing under the same instance of the host environment as the publisher or event creator. Lookup of data is restricted to data published by processes running under the same instance of the host environment as the requester. | |||
% | |||
\declareconstitemvalue{PMIX_RANGE_LOCAL}{2} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading this text I stumbled again on the question of wether ranges are "nested" or not. The new text in PMIx_Lookup
below makes it very clear that it is not, but some forewarning here (or a ref to the lookup rules \ref{chap:pub:retrules}.
) could help clarify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave this one for discussion during the reading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per IAWG on Aug. 15, 2022:
- We should make clean that though the ranges are overlapping in definition, they are not nested. So ranges must match.
- We should add the forward reference in the top paragraph of this section to the lookup section where this is more clearly noted.
Chap_API_Publish.tex
Outdated
\item If all of the above checks pass, then the value is added to the information that is to be returned. | ||
\item The lookup key matches the published key. | ||
\item The type of range specified by the publisher is the same as the type of range specified by the requester. | ||
\item If a custom range is specified by the publisher and the requester, the members described in both cases must be identical. The publisher is not required to be a member of a custom range. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to contradict the rule from #398 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is correct. The other location is wrong.
For the meeting today, attached is a PDF rendering of the current state of this PR. |
Chap_API_Publish.tex
Outdated
\refconst{PMIX_ERR_DUPLICATE_KEY} error. | ||
|
||
publishing to a \refconst{PMIX_RANGE_CUSTOM} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
publishing to a \refconst{PMIX_RANGE_CUSTOM} | |
Publishing to a \refconst{PMIX_RANGE_CUSTOM} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now a nop
I worked thru the chapter and made some recommended changes
Ralph
… On Aug 9, 2022, at 7:31 AM, Josh Hursey ***@***.***> wrote:
@jjhursey commented on this pull request.
In Chap_API_Publish.tex <#398 (comment)>:
> \refconst{PMIX_ERR_DUPLICATE_KEY} error.
-
+publishing to a \refconst{PMIX_RANGE_CUSTOM}
⬇️ Suggested change
-publishing to a \refconst{PMIX_RANGE_CUSTOM}
+Publishing to a \refconst{PMIX_RANGE_CUSTOM}
—
Reply to this email directly, view it on GitHub <#398 (review)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAST5YECWDB5OWW3W4LWDL3VYJTVFANCNFSM5SQEB4SA>.
You are receiving this because you commented.
|
Chap_API_Publish.tex
Outdated
the implementation should | ||
distinguishing between info array elements that specify keys and directives as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the implementation should | |
distinguishing between info array elements that specify keys and directives as follows: | |
the implementation should | |
distinguish between info array elements that specify keys and directives as follows: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Chap_API_Publish.tex
Outdated
\item The lookup key matches the published key. | ||
\item The type of range specified by the publisher is the same as the type of range specified by the requester. | ||
\item If a custom range is specified by the publisher and the requester, the members described in both cases must be identical. The publisher is not required to be a member of a custom range. | ||
\item The requestor must be a member of the publisher's range. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
\item The requestor must be a member of the publisher's range. | |
\item The requestor must be a member of the range specified by the publisher. |
|
||
\item If all of the above checks pass, then the value is added to the information that is to be returned. | ||
\item The lookup key matches the published key. | ||
\item The type of range specified by the publisher is the same as the type of range specified by the requester. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required (e.g., requestor specifies 'node' range, and publisher specifies 'global' range)? Or is it enough that the requestor be part of the publisher's range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new model is that the ranges must match... if you publish to 'global', then the requestor must specifically lookup on 'global' range.
Signed-off-by: David Solt <dsolt@us.ibm.com> - Add PMIX_DATA_TO_PUBLISH directive - Clarify how an implementation distinguished between attributes and data - require that ranges for publish and lookup match - Determine how custom ranges are handled: Publisher need not be a member of a custom range - simplified retrieval rules - Add PMIX_ACCESS_USERIDS and GRPIDS to list of attributes related to publish - move callback presentation to Chap_API_Struct chapter - move lookup structure prior to its use in the chapter
- minor grammar change - better phrasing - remove incorrect statement about publisher having to be included in custom range Signed-off-by: David Solt <dsolt@us.ibm.com>
5ca709f
to
097dd13
Compare
Chap_API_Publish.tex
Outdated
but should not be treated as data to | ||
publish. The implementation may treat any custom (non-standardized) directives it | ||
supports as directives. All other \refarg{info} array elements | ||
should it be assumed to be data to be published. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be assumed to be data to be published. | |
should be assumed to be data to be published. |
Chap_API_Publish.tex
Outdated
\refconst{PMIX_ERR_DUPLICATE_KEY} error. | ||
|
||
In some cases, implementations may be incapable of distinguishing which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note from 3Q 2022 ASC (Aug. 11, 2022):
- Let's require
PMIX_DATA_TO_PUBLISH
as the only way to publish data.- This has some backward compatibility issues.
- Open MPI and (maybe) MPICH would be impacted
- We should add an error if
PMIX_DATA_TO_PUBLISH
is not provided in the publish call. - In v5, mark not using
PMIX_DATA_TO_PUBLISH
as deprecated.
- This has some backward compatibility issues.
- Should we just create a new API with an additional required parameter for the data_to_publish (what to name it?
PMIx_Publish_data
?)- Then we just deprecate the old API instead of adding a new attribute to work around it.
pmix_status_t PMIx_Publish_NEW(const pmix_info_t info_to_publish[], size_t ninfos,
const pmix_info_t info_directives[], size_t ndirs);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per IAWG on Aug. 15, 2022:
- Add new
PMIx_Publish2
andpmix_server_publish2_fn_t
with two sets ofpmix_info_t
s - Deprecate the
PMIx_Publish
API and correspondingpmix_server_publish_fn_t
API. - Do not add the
PMIX_DATA_TO_PUBLISH
attribute. Instead, prefer the new API. - In the Revision appendix note,
PMIx_Publish
was deprecated and replaced byPMIx_Publish2
(same with server API) - No longer an issue on the ordering of info data and directives (this approach avoids this whole issue)
\declareconstitemvalue{PMIX_RANGE_CUSTOM}{6} | ||
Published data and generated events are restricted to processes | ||
described in the \refstruct{pmix_info_t} associated with this call. | ||
Lookup of data is restricted to data published by the processes described in | ||
in the \refstruct{pmix_info_t}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note from 3Q 2022 (Aug. 11, 2022)
- What does it mean for two
PMIX_RANGE_CUSTOM
s to match? Is it a strict ordering or is a subset fine? - Is the publisher required to be part of the range?
- Publisher does need to be a member of the range?
- "If I publish that data I own that data and can look it up so that I can unpublish"
- Current description would prevent the lookup but not the unpublish.
- Owner should always be able to look it up even if not in the range. Maybe need a clause here? What about overlapping keys?
- Publish means "put some data for a set of consumers to lookup". The consumers might not know who published the data. For a custom range, those looking up might not know the full group. Alternatively, the lookup could specify a list including the publisher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per IAWG on Aug. 15, 2022:
- Does the publisher have to be part of the range it publishes to?
- The publisher does not need to be a member of the range.
- The publisher is the owner of the data, so it is the only one able to unpublish the data, but must specify the range where that data was published.
- The publisher is the owner of the data but cannot lookup the data if it is not part of the range in which it was pushed.
- If I'm looking up from a
PMIX_RANGE_CUSTOM
does it have to match what the publisher provided?PMIX_RANGE_NAMESPECE
(example)- Publish: means that the data is to be available to a set of lookup'ers in my namespace.
- Lookup: means that I want data provided by a publisher in my namespace.
PMIX_RANGE_CUSTOM
- Publish: 'custom' means I want the data to be available to this set of lookup'ers. Might not include the publisher.
- Lookup: 'custom' means I want data provided by one of these publishers. Does not need to know all of the processes that have access to the data - only who is providing the data.
- Generally
PMIX_RANGE_
means:- Publish: Define who can look up the data
- Lookup: Define who published the data
- Does the range have to match on the publish and lookup?
- I'm publishing to range X, and I'm looking up on range X.
PMIX_RANGE_CUSTOM
is the only one that takes an argument list of processes. It's the only one where the publisher and lookup'er might be using the same value associated with thePMIX_RANGE_CUSTOM
key.
- What if the publisher publishes the same key (different values) to different custom ranges that both include the client looking up the key? Or two different publishers both publish the same key (different values) that include the lookup client in different
PMIX_RANGE_CUSTOM
s.- This is not allowed.
- The Server should reject the lookup operation as conflicting entries. The client should then change their custom range to include one
- What If I want to lookup a value published by a process outside of my namespace?
- Publisher will use
PMIX_RANGE_SESSION
orPMIX_RANGE_CUSTOM
(if they know the lookup'er) - Lookup'er will user
PMIX_RANGE_CUSTOM
with the publisher's name
- Publisher will use
- Future work: Maybe extend
PMIX_RANGE_NAMESPACE
with an argument of which namespace. Currently, it is the same as the caller.
Chap_API_Publish.tex
Outdated
rendezvous with each other without knowing in advance the identity of the other | ||
namespace or when that namespace might become active. | ||
|
||
process is known in advance, for example, two namespaces that do not share a child-parent relationship. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
process is known in advance, for example, two namespaces that do not share a child-parent relationship. | |
process is known in advance (e.g., two namespaces that do not share a child-parent relationship). |
Chap_API_Publish.tex
Outdated
@@ -195,25 +227,33 @@ \subsection{Range of Published Data} | |||
Undefined range. | |||
% | |||
\declareconstitemvalue{PMIX_RANGE_RM}{1} | |||
Data is intended for the host environment, or lookup is restricted to data published by the host environment. | |||
Published data and generated events are restricted to processes executing under the same instance of the host environment as the publisher or event creator. Lookup of data is restricted to data published by processes running under the same instance of the host environment as the requester. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note from 3Q 2022 (Aug. 11, 2022)
- Old text was intended. That the published data is intended solely for the host environment not for application/tool processes.
- Suggest leave the old text
PMIX_RANGE_GLOBAL
: Anyone under that RM instance (basically what the new text is saying)
3Q 2022:
|
Larger changes are still coming Signed-off-by: David Solt <dsolt@us.ibm.com>
Here is the latest draft with the changes from the first reading (i.e. introduce a PMIx_Publish2, get rid of PMIX_DATA_TO_PUBLISH, and specify that the range of lookup is used to determine eligible publishers). Will update the PR soon after WG has had the chance to review. |
Signed-off-by: David Solt <dsolt@us.ibm.com>
Signed-off-by: David Solt <dsolt@us.ibm.com>
…cription Signed-off-by: David Solt <dsolt@us.ibm.com>
PMIx ASC 4Q 2022 Meeting presented a straw poll regarding:
Three options were presented:
Sorted by choice preference
Other comments:
|
PMIx ASC 4Q 2022 day 2 meeting discussed this further and outlined a path forward. See the notes below |
Notes from IAWG meeting Oct. 31, 2022
|
List of items to deal with:
|
… to the new structures
…stead of null-terminating
I added a new commit for:
|
PMIx ASC 1Q 2023 Pleanary Notes (see Notes Day 1 for more details)
|
IAWG's first pass at changes to the Publish/Lookup chapter. This is a WIP, but far enough along that people may be interested in seeing what we are working on. Commits are not yet signed-off.