error 403 downloading SWEET using Protege or BioPortal #150
Comments
Initial troubleshooting log over Slack:
|
We at BioPortal are reporting this to the OWLAPI developer. |
Do you have a link to the issue? See previous comments on this issue. You
have gotten this working with rdf tools but that doesn't demonstrate there
is no issue with the owl.
|
This is kind of odd, is the sweet server configured in an odd way?
The Cookie line here is suspicious:
```
$ wget http://sweetontology.net/human
--2019-07-23 08:19:20-- http://sweetontology.net/human
Resolving sweetontology.net... 34.216.150.176
Connecting to sweetontology.net|34.216.150.176|:80... connected.
HTTP request sent, awaiting response... 200 OK
Cookie coming from sweetontology.net attempted to set domain to esipfed.org
Length: unspecified [text/turtle]
Saving to: `human'
[ <=> ] 5,354 --.-K/s in 0s
2019-07-23 08:19:20 (56.7 MB/s) - `human' saved [5354]
```
|
@cmungall Good observation about the domain set by cookie. Just for reference, the full set of headers as shown by httpie:
I'll see how to adjust the Apache config so that it either sets no cookie at all or sets .sweetontology.net as the domain. |
@cmungall I just adjusted the Apache settings so that no cookies are set. The same request/response (only headers) looks like this now (no
|
I'm looking at the 403 error reported by Protégé (5.2.0). (Note: I previously mentioned "proxy-pass" as the basic mechanism for sweetontology.net/* resolution but actually this is based on Looking at
(It would be useful to see all the headers set by Protégé in this request, but the apache logging on the server is not set up for that, afaict.) However, while loading via httpie (eg.,
Somehow the request from Protégé is not triggering the |
TL;DR Just diagnosed the problem: Cloudflare(*) is the piece that is complaining with 403 when the request is done with header So, some quick suggestions:
Details To see the request headers, which is key to continue this investigation, I just enabled mod_log_forensic on the server. This is what's logged out in
The only "interesting" header is
|
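A minimal sketch of how the header dependence could be checked from the client side, assuming (as a later comment in this thread reports) that the `User-Agent` value is what triggers the 403. The function names `fetch_status`, `probe_user_agents`, and `rejected_agents` are hypothetical helpers, and any User-Agent string other than `Java/1.8.0_40` is an illustrative assumption rather than something taken from the server logs.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_status(url, user_agent, timeout=10):
    """Return the HTTP status code for a GET sent with the given User-Agent."""
    req = Request(url, headers={"User-Agent": user_agent, "Accept": "*/*"})
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.status
    except HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is what we compare.
        return err.code


def probe_user_agents(url, user_agents, fetch=fetch_status):
    """Map each User-Agent string to the status code returned for it.

    `fetch` is injectable so the comparison logic can be exercised offline.
    """
    return {ua: fetch(url, ua) for ua in user_agents}


def rejected_agents(results):
    """Return the User-Agent strings that were answered with 403, sorted."""
    return sorted(ua for ua, status in results.items() if status == 403)
```

Calling `probe_user_agents("http://sweetontology.net/sweetAll", ["Java/1.8.0_40", "curl/7.64.1"])` and passing the result to `rejected_agents` would list the agents that are being blocked at the time of the call; the outcome depends on the current server and proxy configuration.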
I will pursue this with Matthew (Protege) and Jennifer (BioPortal). We can perhaps test it easily enough on the BioPortal side, will check. |
In BioPortal we use OWL API code that looks roughly like the following to load ontologies:
I could be wrong, but I don't believe the OWL API has any public methods that would allow us to specify headers for the loading of imports. I'll investigate this a little more when I have time. Per the suggestion from @carueda about a more recent version of Java - I executed the above snippet of code in my local development environment against Java 8 (what we currently use in BioPortal) and Java 12. Running against Java 8, I get the 403 responses:
... however, running against Java 12 eliminates the occurrence of the 403s. |
Aha! Very telling. So, likely a default setting of Java 8, in how it tries to open content off the web. There must be a default configuration in Java that can be modified to change the header settings. |
I guess you could have the java client lie about its version, but it seems
the problem is on the server configuration side. Even if you have a
bioportal-specific hack, you want this to work for everyone.
Why does cloudflare decide to reject this? Seems totally arbitrary. IMO
server behavior should be more transparent and predictable.
Someone else has complained about this but they got no response:
https://community.cloudflare.com/t/cloudflare-blocks-java-http-client/73621
thanks for the excellent sleuthing Carlos!
|
You may also be interested in the OBO PURL system; it could easily be
adapted for SWEET:
https://github.com/OBOFoundry/purl.obolibrary.org/
It can run on a tiny Amazon server costing virtually nothing, and you could
just have it redirect to raw GitHub URLs (or S3 or anything else).
|
Interesting suggestion @cmungall (do you want to capture it in a separate issue?) I'm not sure about its concrete capabilities but just wondering if it can handle different ontology representations automatically, or would they have to be pre-generated and then resolved via apache/nginx or similar rules...
Agree, totally arbitrary. |
I entered a ticket in the Protege repository. (Somehow I don't think I'll get far with a ticket in the cloudflare repository, but maybe @abburgess has some sway.) BioPortal has identified a fix to this issue (upgrading to Java 11, see ncbo/bioportal-project#127), but has to finish upgrading SOLR and various other bits before the fix will be in production. I prefer keeping tickets open until the problem is resolved. Even if it isn't the fault of the COR, the user sees it in COR. I'll defer to the judgment of the community here…. |
This is a server issue. Not a Protege one, not an OWL API one. The server should not block these calls. |
I agree, but from an end user perspective, the fact the cloudflare server is a fail is of no relevance, as it may never get fixed. The issue remains whether clients that use OWL API should try to find a workaround. |
I don't understand. It is of massive user relevance as many users will be using the owlapi. Are you compelled to use cloudflare? |
The entire hosting organization (ESIP) has just changed to CloudFlare, it was a significant transition. So I'll presume to say "yes". I should say there is still an 'open topic' with CloudFlare about the issue, but my impression is that they are not showing a lot of initiative to address it. Like you I would expect the impact to go far, even well beyond OWL API. |
Re: Cloudflare. We're on hold for a bit, as the person who manages this
for us has been out sick. I reached out earlier today to check in and will
give you an update soon.
|
sweetontology.net is a separate domain, so it shouldn't be hard to make an
exception? E.g. you could put the ontology files in an S3 bucket and have
the URL redirect to it.
|
We could yes. This would however break our watchdog script so SWEET source and the linked data representation would become out of sync. I don't have the cycles to hack together code to address this. The Cloudflare move has screwed us and I would rather invest any time I have fixing that issue tbh. |
From earlier in this thread, @cmungall said:
> I don't understand. It is of massive user relevance as many users will be using the owlapi.
I misspoke; the problem is of significant relevance. I was trying to say that users won't care whose issue it is: for them it's a user-facing issue. So we have to consider fixing it on our end whether or not Cloudflare is the problem. I think at this point we're waiting for @abburgess's update from her conversation with Cloudflare. But I consider you to be the biggest driver of action on this ticket, as it affects "your" ontology. You are one of the most active developers and probably one of the most affected users. |
Can you jump in on this David?
Annie Burgess, PhD
Lab Director | Earth Science Information Partners (ESIP)
|
Hi everyone, Apologies for the delay here, I am just getting back up to speed after a long recovery period. I haven't looked through the issue in detail, but if it is caused by cloudflare filtering, a first quick fix to try would be to disable the cloudflare proxying for the cor.esipfed.org subdomain. So cloudflare would provide the DNS registration only, and requests to cor.esipfed.org should not run through their servers. I have now done this, and it should take effect immediately - could you retest and see if this resolves the issue? Thanks, David |
Thanks @dbassendine, using Protégé (5.2.0), http://sweetontology.net/sweetAll now loads fine: |
This also fixed the sweet-alignment-manager |
Great! Sorry for the disruption caused by the transition over to Cloudflare. |
The problem has resurfaced also with robot:
$ robot merge -I http://sweetontology.net/sweetAll -o download/sweet.ttl
org.semanticweb.owlapi.io.OWLOntologyCreationIOException: Server returned HTTP response code: 403 for URL: http://sweetontology.net/sweetAll
Use the -vvv option to show the stack trace.
Use the --help option to see usage information.
I can confirm this is the same weird discrimination against Java/1.8.0_40:
$ curl -vvv -H "User-Agent:Java/1.8.0_40" http://sweetontology.net/sweetAll
* Trying 104.27.158.188...
* TCP_NODELAY set
* Connected to sweetontology.net (104.27.158.188) port 80 (#0)
> GET /sweetAll HTTP/1.1
> Host: sweetontology.net
> Accept: */*
> User-Agent:Java/1.8.0_40
>
< HTTP/1.1 403 Forbidden
< Date: Mon, 05 Oct 2020 02:53:02 GMT
< Content-Type: text/plain; charset=UTF-8
< Content-Length: 16
< Connection: keep-alive
< Set-Cookie: __cfduid=d96784c3fc857ee41e8f517eee57ebff71601866382; expires=Wed, 04-Nov-20 02:53:02 GMT; path=/; domain=.sweetontology.net; HttpOnly; SameSite=Lax
< X-Frame-Options: SAMEORIGIN
< Cache-Control: private, max-age=0, no-store, no-cache, must-revalidate, post-check=0, pre-check=0
< Expires: Thu, 01 Jan 1970 00:00:01 GMT
< cf-request-id: 05984655c3000028273b9f8200000001
< Report-To: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report?lkg-colo=4&lkg-time=1601866383"}],"group":"cf-nel","max_age":604800}
< NEL: {"report_to":"cf-nel","max_age":604800}
< Server: cloudflare
< CF-RAY: 5dd3d99c6f4f2827-SJC
<
* Connection #0 to host sweetontology.net left intact |
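For anyone repeating this check, here is a rough sketch of how the verbose curl output above could be read mechanically. It only tests whether a 403 response carries Cloudflare's markers (`Server: cloudflare` or a `CF-RAY` header, both visible in the transcript); that shows the response passed through Cloudflare, not which layer generated it. The helper names are hypothetical.

```python
def parse_curl_verbose(lines):
    """Extract (status, headers) from the '<'-prefixed response lines of `curl -v` output."""
    status, headers = None, {}
    for line in lines:
        if not line.startswith("< "):
            continue  # skip request lines ('>'), info lines ('*'), and the bare '<'
        text = line[2:].strip()
        if text.startswith("HTTP/"):
            status = int(text.split()[1])
        elif ":" in text:
            name, value = text.split(":", 1)
            headers[name.strip().lower()] = value.strip()
    return status, headers


def went_through_cloudflare(headers):
    """True if the (lower-cased) response headers carry Cloudflare markers."""
    return headers.get("server", "").lower() == "cloudflare" or "cf-ray" in headers


def is_cloudflare_403(status, headers):
    """Heuristic: a 403 that was served via Cloudflare's proxy."""
    return status == 403 and went_through_cloudflare(headers)
```

Feeding the response lines from the transcript above into `parse_curl_verbose` and then into `is_cloudflare_403` flags the request as blocked behind the proxy; a 403 answered directly by Apache would not match.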
I understand the hosting for sweetontology was changed about a month ago, maybe @lewismc can elaborate. |
I did a bit of experimenting. |
@dbassendine can you please repeat the following steps
Thank you |
Just for your information, I encountered the same problem in another project. It seems to be a user agent problem. See my experimentation results at redhat-developer/vscode-xml#429 (comment) |
Per the recent Slack thread (@graybeal), attached is a WebVOWL error from the Github ESIP SWEET link When testing it manually, inserting the URL on the WebVOWL site, the message reads "Error: Received empty graph" @dbassendine - After your first fix, do you see how it can be fixed again? |
Confirmed this is still an issue.
Could you please either verify the cloudflare proxying is disabled or verify that you (or someone) has disabled it again. Thank you. |
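One way to check the proxying from the outside, sketched under assumptions: earlier in this thread sweetontology.net resolved to 34.216.150.176 in the 2019 wget transcript, while the 2020 curl transcript connected to 104.27.158.188. The two networks below are assumed to be among Cloudflare's published IPv4 ranges; the list is deliberately partial and should be checked against Cloudflare's current published list. The function names are hypothetical.

```python
import ipaddress
import socket

# Assumption: a partial, illustrative subset of Cloudflare's published IPv4 ranges.
# The authoritative list is maintained by Cloudflare and may change over time.
ASSUMED_CLOUDFLARE_RANGES = [
    ipaddress.ip_network("104.16.0.0/13"),
    ipaddress.ip_network("104.24.0.0/14"),
]


def in_assumed_cloudflare_range(ip):
    """True if the IPv4 address falls inside one of the assumed ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ASSUMED_CLOUDFLARE_RANGES)


def resolve_ipv4(hostname):
    """Return the sorted set of IPv4 addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, 80, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def proxied_addresses(hostname, resolver=resolve_ipv4):
    """Return the resolved addresses that look like Cloudflare proxy addresses."""
    return [ip for ip in resolver(hostname) if in_assumed_cloudflare_range(ip)]
```

If `proxied_addresses("sweetontology.net")` returns a non-empty list, requests are probably still going through the proxy; an empty list is only suggestive, since the range list here is incomplete.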
@esip-lab Annie: could you recheck on this given that access from protege remains unsuccessful and the fact that the amazon instance is now different? +cc: @rrovetto , @graybeal , @brandonnodnarb . |
I'm not sure what you want me to check here.
|
Annie, we're convinced that although we've checked with CloudFlare before and they denied it, they are causing the issue with resolution (because Protege w/Java 8 doesn't resolve the sub-ontologies correctly while BioPortal w/Java 11 does, for one example—that exactly matches the previous problem that CloudFlare fixed for a year). Brandon's comment cites an easy test to confirm whether CloudFlare is the blocker. |
Okay, I have turned OFF the proxy through Cloudflare based on Brandon's comment.
|
Thanks! To my surprise, Protege still has a problem opening https://sweetontology.net/sweetAll. Oh, but opening http://sweetontology.net/sweetAll now works, pulling in all ontologies. And, opening https://raw.githubusercontent.com/ESIPFed/sweet/master/src/sweetAll.ttl in Protege successfully pulls in all the other ontologies. (I'm not absolutely sure whether we tried this before, so we may have to try turning the proxy back on to see if it still works.) This is the error message for https:
So before we turn proxying back on and retest, does anyone want to explore or opine about the SSL not resolving? I think we should be handling https requests also. |
AFAICT, HTTPS access has never been set up ESIPFed/cor#37 (note that |
Following John, I tested the raw git link in the past and I don't recall it opening.
I recommend we also list links to open all stored versions of sweet. |
Just don't try with But http://sweetontology.net/ resolves just fine from my side, that is:
make sense? |
Thanks for all your efforts on this issue @carueda, @graybeal, @rrovetto, and @esip-lab. I have tested http://sweetontology.net/sweetAll using the I can also confirm that a browser renders http://sweetontology.net/ as https://github.com/ESIPFed/sweet/blob/master/README.md which matches the current set up/configuration. As such, I will close this issue. Please lodge any other issues mentioned in the thread separately. |
I replaced the text that is no longer relevant now that #150 is closed. I think there should be more text about how to use SWEET programmatically, but I'll make a new issue for this.
Neither Protege nor BioPortal could open SWEET version 3.3.0 (http://sweetontology.net/sweetAll), getting 403 errors when attempting to follow the redirects.
Initial troubleshooting suggested OWLAPI was a principal component of concern, since other tools could open SWEET OK.