Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

auto-reconnect not connecting when it is needed for capabilites access #1698

Open
kenkendk opened this issue Jan 1, 2025 · 17 comments
Open

Comments

@kenkendk
Copy link

kenkendk commented Jan 1, 2025

FTP Server OS: Any

FTP Server Type: Any

Client Computer OS: Any

FluentFTP Version: 52.0.0

Framework: .NET 8

This is not really an error with FluentFTP, but more a confusing auto-feature, that was hard to isolate because it "mostly works". Only in some cases where a retry (outside of FluentFTP) is attempted, this would get triggered. This got mixed up in #1690 because that is one way to trigger it.

I distilled it down to the following code that works:

var client = new AsyncFtpClient() { ... };
var dir = await client.GetWorkingDirectory();
await client.UploadStream(...);

But if you do not read the working directory, you get an exception with the message "Please call Connect() before trying to read the Capabilities!":

var client = new AsyncFtpClient() { ... };
await client.UploadStream(...);

It is of course an error not to call await client.Connect(), so one could argue that this is as intended.
But the fact that some methods will automatically (re-)connect and others not makes it confusing. In my case it was hard to pin down, because the List method appears to connect automatically.

I would suggest either making all methods auto-connect or all methods not auto connecting.
Alternatively, you could consider checking the state of the client on the various calls to give an earlier exception/stack-trace that makes it clear it is the caller making the mistake, and not an internal check that fails.

@FanDjango
Copy link
Collaborator

But the fact that some methods will automatically (re-)connect and others not makes it confusing

I agree.

I would suggest either making all methods auto-connect or all methods not auto connecting

I have spent very many hours myself tracking down the sort of problems you mention.

not auto connecting would be my choice.

The only kind of "hidden" or "secret" re-connect then left over, would be the one that happens in the case of stale data being left over on the socket prior to the next command that needs to be sent to the server, although even that reconnect is a sort of kludge to clear up strange connection states. Getting that to work was up to now quite difficult, think about re-creating the state of the control session, the current directory, the current transfer type (binary/ascii) etc. Any custom servers (like z/OS, for example) might introduce more things that would need to be restored.

It was not my decision to do these things, auto-connect, auto-re-connect. They have a number of advantages, but meh.

The intention was to make things easier for the user, so he does not need to do the kind of checking he isn't really familiar with, FTP being such a strange protocol which is also not implemented consistently across all servers regarding the RFCs.

But the implementation is not totally clean, and thus you have admirably described the current state of affairs. Thanks for that.

The real question is: What to do?

I know of users who code their download by creating the client, calling the download API and that's that. They don't even know anything about "Connecting" and much less about a possible need to "Disconnect". That was really a good thing: Isolating the user from even knowing there is a protocol.

@FanDjango
Copy link
Collaborator

This got mixed up in #1690 because that is one way to trigger it.

As per this, I have yet to actually get enough information to actually understand what caused that problem in your case.

Some source code, a detailed FluentFTP log and a good pair of eyes, and lo and behold, one arrives at an explanation. But I feel like I am being fed little problem tidbits taste by taste and my crystal ball is very cloudy. I don't understand the connection of this issue to #1960.

@kenkendk
Copy link
Author

kenkendk commented Jan 1, 2025

The real question is: What to do?

I see your point, most people do not understand FTP and the logic is also not really easy to understand if you are used to stateless requests (like HTTP).

If you view it this way, I would argue that the Connect() call should be removed from the API and all requests should auto (re-)connect.

If you view it as an FTP library (giving abstract access to an FTP server), I would argue that the auto-connect feature should be removed, and the caller is responsible for state management.

A solution could be to support both, by having a AutoConnect flag in the config, that would toggle either mode. If that existed, I am pretty sure most usage would have it "on", suggesting that this is the right path?

I don't understand the connection of this issue to #1960.

What happened here was that the underlying socket would time out, closing the connection. Because the UploadStream call does not auto-connect, this would cause the "Please call Connect()" error message to appear when creating a new connection for retrying the upload, without calling any auto-connecting methods.

You do not need to understand the connection, I was just trying to clear up why the other issue got so messy.

@FanDjango
Copy link
Collaborator

FanDjango commented Jan 1, 2025

You do not need to understand the connection, I was just trying to clear up why the other issue got so messy.

Ok, I see.

A solution could be to support both, by having a AutoConnect flag in the config

I am thinking about that possibility too.

Your two alternative szenarios, that you described, are an exact description of the real word - we have both kinds of users. Although I do not see what type are in the majority.

So: We could create one or two flags, defaulting to the "easy" mode, which steer the behaviour.

AutoConnect.Never                Never autoconnect nor autoreconnect
AutoConnect.OnlyReconnect        Only autoconnect when connection drops
AutoConnect.Always               Autoconnect whenever needed <----DEFAULT

Should put this on the TODO list.

@FanDjango
Copy link
Collaborator

FanDjango commented Jan 1, 2025

Once again quick solutions may be suspect, after some looking at the code, I disagree with the analysis above.

Fact is:

All calls to the API that end in a call to Execute(...) (attempts to send a command on the control connection) will enact a (re-)connect if no connection exists.. ALL OF THEM (unless there is a bug in that logic, even there).

So if you create a client, it is not connected at first.

Call an API and you will be connected.

Unless there is a bug somewhere, and actually, there is.

Any API that checks the servers capabilites before you are connected will get this error.

Some other APIs might exhibit other errors, this needs to be checked, perhaps.

Because those capabilities are only set when a connect takes place.

That is the actual error which also is a fixable error. The workaround of course, is to connect after creating a client, as you said.

My fix, some time in the future, will be to connect if someone calls an API that needs to query capabilites before beginning its work, the actual bug.

@FanDjango FanDjango added bug and removed discussion labels Jan 1, 2025
@FanDjango FanDjango changed the title Confusing auto-connect logic auto-connect not autoconnecting when it is needed for capabilites access Jan 1, 2025
@FanDjango
Copy link
Collaborator

Oh, but I still believe this here is a good idea and needs doing:

AutoConnect.Never                Never autoconnect nor autoreconnect
AutoConnect.OnlyReconnect        Only autoconnect when connection drops
AutoConnect.Always               Autoconnect whenever needed <----DEFAULT

@FanDjango
Copy link
Collaborator

Using your code (well, using UploadFile instead of UploadStream) without doing a connect:

			await using var client = new AsyncFtpClient("bla", "bla", "bla");
			await client.UploadFile(@"D:\testdir\gentoo_root.img", @"/home/bla/test/gentoo_root.img", FtpRemoteExists.Overwrite, false, FtpVerify.None);
		}

I do not get the error message you stated (in this issue and in others).
I get:

>         UploadFile("D:\testdir\gentoo_root.img", "/home/bla/test/gentoo_root.img", Overwrite, False, None)
Warning:  Reconnect needed due to disconnected control connection
Status:   Command stashed: PWD
>         Connect(True)
Warning:  Cannot reconnect, reverting to full connect
Status:   FluentFTP 52.0.0.0(.NET 5.0) AsyncFtpClient
Status:   Connecting(async) AsyncFtpClient.FtpSocketStream(control) IP #1 = ***:21
Status:   Waiting for a response
Response: 220 ProFTPD Server (Debian) [::ffff:192.168.1.109] [739252,492d]
Status:   Detected FTP server: ProFTPD
Command:  AUTH TLS
Status:   Waiting for response to: AUTH TLS
Response: 234 AUTH TLS successful [4ms]
Status:   FTPS authentication successful, lib = .NET SslStream, cipher suite = Tls12 (Aes256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, 44550, 255) [45ms]
Command:  USER ***
Status:   Waiting for response to: USER ***
Response: 331 Password required for *** [2ms]
Command:  PASS ***
Status:   Waiting for response to: PASS ***
Response: 230 User *** logged in [26ms]
Command:  PBSZ 0
Status:   Waiting for response to: PBSZ 0
Response: 200 PBSZ 0 successful [1ms]
Command:  PROT P
Status:   Waiting for response to: PROT P
Response: 200 Protection set to Private [1ms]
Command:  FEAT
Status:   Waiting for response to: FEAT
Response: 211-Features:
Response: AUTH TLS
Response: CCC
Response: CLNT
Response: CSID
Response: EPRT
Response: EPSV
Response: HASH CRC32;MD5;SHA-1*;SHA-256;SHA-512;
Response: HOST
Response: LANG en-US.UTF-8*;en-US
Response: MD5
Response: MDTM
Response: MFF modify;UNIX.group;UNIX.mode;
Response: MFMT
Response: MLST modify*;perm*;size*;type*;unique*;UNIX.group*;UNIX.groupname*;UNIX.mode*;UNIX.owner*;UNIX.ownername*;
Response: PBSZ
Response: PROT
Response: RANG STREAM
Response: REST STREAM
Response: SITE COPY
Response: SITE MKDIR
Response: SITE RMDIR
Response: SITE SYMLINK
Response: SITE UTIME
Response: SIZE
Response: SSCN
Response: TVFS
Response: UTF8
Response: XCRC
Response: XMD5
Response: XSHA
Response: XSHA1
Response: XSHA256
Response: XSHA512
Response: 211 End [3ms]
Status:   Text encoding: System.Text.UTF8Encoding+UTF8EncodingSealed
Command:  OPTS UTF8 ON
Status:   Waiting for response to: OPTS UTF8 ON
Response: 200 UTF8 set to on [1ms]
Command:  SYST
Status:   Waiting for response to: SYST
Response: 215 UNIX Type: L8 [<1ms]
Status:   Active ServerHandler is: ProFTPD
Status:   Listing parser set to: Machine
Command:  PWD
Status:   Waiting for response to: PWD
Response: 257 "/home/bla" is the current directory [1ms]

Status:   Executing stashed command

Command:  PWD
Status:   Waiting for response to: PWD
Response: 257 "/home/bla" is the current directory [<1ms]
Status:   AutoNavigate to: "/home/bla/test"
>         SetWorkingDirectory("/home/bla/test")
Command:  CWD /home/bla/test
Status:   Waiting for response to: CWD /home/bla/test
Response: 250 CWD command successful [1ms]
>         FileExists("gentoo_root.img")
Command:  SIZE /home/bla/test/gentoo_root.img
Status:   Waiting for response to: SIZE /home/bla/test/gentoo_root.img
Response: 550 SIZE not allowed in ASCII mode [1ms]
Command:  TYPE I
Status:   Waiting for response to: TYPE I
Response: 200 Type set to I [1ms]
Command:  SIZE /home/bla/test/gentoo_root.img
Status:   Waiting for response to: SIZE /home/bla/test/gentoo_root.img
Response: 213 2147483648 [<1ms]
>         DeleteFile("gentoo_root.img")
Command:  DELE gentoo_root.img
Status:   Waiting for response to: DELE gentoo_root.img
Response: 250 DELE command successful [16ms]
>         OpenWrite("gentoo_root.img", Binary, -1, False)
>         OpenDataStreamAsync("STOR gentoo_root.img", 0)
>         OpenPassiveDataStreamAsync(PASV, "STOR gentoo_root.img", 0)
Command:  PASV
Status:   Waiting for response to: PASV
Response: 227 Entering Passive Mode (***,***,***,***,134,71). [1ms]
Status:   PASV advertised a non-routable IPAD. Using original connect dnsname/IPAD
Status:   Connecting(async) AsyncFtpClient.FtpSocketStream(data) IP #1 = ***:34375
Command:  STOR gentoo_root.img
Status:   Waiting for response to: STOR gentoo_root.img
Response: 150 Opening BINARY mode data connection for gentoo_root.img [2ms]
Status:   FTPS authentication successful, lib = .NET SslStream, cipher suite = Tls12 (Aes256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, 44550, 255) [3ms]
Status:   Uploaded 2147483648 bytes, 29,436s, 69,6 MB/s
Status:   Disposing(async) AsyncFtpClient.FtpSocketStream(data)
Status:   Waiting for response to: STOR gentoo_root.img
Response: 226 Transfer complete [29,457s]
Status:   AutoNavigate-restore to: "/home/bla"
>         SetWorkingDirectory("/home/bla")
Command:  CWD /home/bla
Status:   Waiting for response to: CWD /home/bla
Response: 250 CWD command successful [1ms]
>         DownloadFile("D:\testdir\gentoo_root.img", "/home/bla/test/gentoo_root.img", Overwrite, None)
Status:   AutoNavigate to: "/home/bla/test"
>         SetWorkingDirectory("/home/bla/test")
Command:  CWD /home/bla/test
Status:   Waiting for response to: CWD /home/bla/test
Response: 250 CWD command successful [2ms]
>         OpenRead("gentoo_root.img", Binary, 0, 0, False)
>         GetFileSize("gentoo_root.img", -1)
Command:  SIZE gentoo_root.img
Status:   Waiting for response to: SIZE gentoo_root.img
Response: 213 2147483648 [1ms]
>         OpenDataStreamAsync("RETR gentoo_root.img", 0)
>         OpenPassiveDataStreamAsync(PASV, "RETR gentoo_root.img", 0)
Command:  PASV
Status:   Waiting for response to: PASV
Response: 227 Entering Passive Mode (***,***,***,***,159,245). [1ms]
Status:   PASV advertised a non-routable IPAD. Using original connect dnsname/IPAD
Status:   Connecting(async) AsyncFtpClient.FtpSocketStream(data) IP #1 = ***:40949
Command:  RETR gentoo_root.img
Status:   Waiting for response to: RETR gentoo_root.img
Response: 150 Opening BINARY mode data connection for gentoo_root.img (2147483648 bytes) [1ms]
Status:   FTPS authentication successful, lib = .NET SslStream, cipher suite = Tls12 (Aes256, TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, 44550, 255) [1ms]
Status:   Downloaded 2147483648 bytes, 18,634s, 109,9 MB/s
Status:   Disposing(async) AsyncFtpClient.FtpSocketStream(data)
Status:   Waiting for response to: RETR gentoo_root.img
Response: 226 Transfer complete [18,737s]
Status:   AutoNavigate-restore to: "/home/bla"
>         SetWorkingDirectory("/home/bla")
Command:  CWD /home/bla
Status:   Waiting for response to: CWD /home/bla
Response: 250 CWD command successful [<1ms]
>         Disconnect()
Command:  QUIT
Status:   Waiting for response to: QUIT
Response: 221 Goodbye. [<1ms]
Status:   Disposing(async) AsyncFtpClient.FtpSocketStream(control)
>         DisposeAsync()
Status:   Disposing(async) AsyncFtpClient
>         Disconnect()
Status:   Connection already closed, nothing to do.
Status:   Disposing(async) AsyncFtpClient.FtpSocketStream(control) (redundant)

So, you can see that your error message must be issued in some other context. Without some detailed information there is hardly any way to clear up these mixed up issues. And we are forever going off on some tangent or wild goose chase using the snippets of information and wild guesses. You've even got me doing it too...

Of course, the question remains open: Exactly what sequence of events led to the error message "Please call connect() etc. etv." being issued"?

So IMHO we should focus on that as a first priority, before we talk about what to do with AutoConnect yes/no.

I feel that we don't have the total picture yet.

Notice that there is a lot of PWD / CWD activity in my log. That's because the client.Config.Autonavigate setting is in effect. That causes the first command to be PWD, which then causes the (automatic) Connect.

If I had a log from you, and/or detailed config information, I might be able to see that something else or some other sequence is happening, where the capabilities are queried before any connect takes places.

This way, I am spending hours with conjecture and trying to recreate theoretical scenarios to no avail.

@FanDjango FanDjango changed the title auto-connect not autoconnecting when it is needed for capabilites access auto-reconnect not connecting when it is needed for capabilites access Jan 2, 2025
@FanDjango
Copy link
Collaborator

FanDjango commented Jan 3, 2025

Because this continues to worry me, I tried to follow your example code more exactly.

Looks like I found the actual reason for the "Please call connect() etc. etv." being issued"? message.

It seems you are probably using client.Config.AutoNavigate.Manual, which turns the modus of FluentFTP directory navigation into legacy mode. Oh, and that AutoNavigate mode is the default.

In this mode, the problem you described becomes apparent. I am wondering what other possible way one could provoke this, but none come to mind right away.

I'll try to fix that in a good way but it won't be easy.

>         UploadFile("D:\testdir\gentoo_root.img", "/home/bla/test/gentoo_root.img", Overwrite, False, None)
>         FileExists("/home/bla/test/gentoo_root.img")
Status:   Error encountered uploading file
>         DisposeAsync()
Status:   Disposing(async) AsyncFtpClient
>         Disconnect()
Status:   Connection already closed, nothing to do.
FluentFTP.Exceptions.FtpException
  HResult=0x80131500
  Message=Error while uploading the file to the server. See InnerException for more info.
  Source=FluentFTP
....
....
Inner Exception 1:
FtpException: Please call Connect() before trying to read the Capabilities!

@FanDjango
Copy link
Collaborator

@kenkendk You might want to review #683 and look at PR #1704

@kenkendk
Copy link
Author

I managed to recreate it again. This is the code with the parameters:

private sealed class DiagnosticsLogger : IFtpLogger
{
  public void Log(FtpLogEntry entry)
  {
    Console.WriteLine(entry.Message);
  }
}

var config = new FtpConfig
{
    DataConnectionType = FtpDataConnectionType.AutoPassive,
    EncryptionMode = None,
    SslProtocols = None,
    LogToConsole = true
};

var client = new AsyncFtpClient
{
    Host = "127.0.0.1",
    Port = 2121,
    Credentials = new NetworkCredential() { Username = "user", Password="password" },
    Config = config,
    Logger = new DiagnosticsLogger()
};

Server is Docker instance of SFTPGo:

docker run --env=SFTPGO_FTPD__BINDINGS__0__PORT=2121 --env=SFTPGO_FTPD__PASSIVE_PORT_RANGE__START=50000 --env=SFTPGO_FTPD__PASSIVE_PORT_RANGE__END=50009  --volume=/.../data:/srv:rw  -p 2022:2022 -p 2121:2121 -p 50000:50000 -p 50001:50001 -p 50002:50002 -p 50003:50003 -p 50004:50004 -p 50005:50005 -p 50006:50006 -p 50007:50007 -p 50008:50008 -p 50009:50009 -p 8080:8080 -d drakkan/sftpgo:latest

Output from the run is:

>         UploadStream("/dmg-bg.png", Overwrite, False)

# UploadStream("/dmg-bg.png", Overwrite, False)
>         FileExists("/dmg-bg.png")

# FileExists("/dmg-bg.png")
Error encountered uploading file
Status:   Error encountered uploading file
>         Dispose()

# Dispose()
Warning: sync dispose called for AsyncFtpClient object
Status:   Warning: sync dispose called for AsyncFtpClient object
>         DisposeAsync()

# DisposeAsync()
Disposing(async) AsyncFtpClient
Status:   Disposing(async) AsyncFtpClient
>         Disconnect()

# Disconnect()
Connection already closed, nothing to do.
Status:   Connection already closed, nothing to do.
Command failed: Error while uploading the file to the server. See InnerException for more info.

FluentFTP.Exceptions.FtpException: Error while uploading the file to the server. See InnerException for more info.
 ---> FluentFTP.Exceptions.FtpException: Please call Connect() before trying to read the Capabilities!
   at FluentFTP.Client.BaseClient.BaseFtpClient.get_Capabilities()
   at FluentFTP.Client.BaseClient.BaseFtpClient.HasFeature(FtpCapability cap)
   at FluentFTP.AsyncFtpClient.FileExists(String path, CancellationToken token)
   at FluentFTP.AsyncFtpClient.UploadFileInternalAsync(Stream fileData, String localPath, String remotePath, Boolean createRemoteDir, FtpRemoteExists existsMode, Boolean fileExists, Boolean fileExistsKnown, IProgress`1 progress, CancellationToken token, FtpProgress metaProgress)
   --- End of inner exception stack trace ---
   at FluentFTP.AsyncFtpClient.UploadFileInternalAsync(Stream fileData, String localPath, String remotePath, Boolean createRemoteDir, FtpRemoteExists existsMode, Boolean fileExists, Boolean fileExistsKnown, IProgress`1 progress, CancellationToken token, FtpProgress metaProgress)
   at FluentFTP.AsyncFtpClient.UploadStream(Stream fileStream, String remotePath, FtpRemoteExists existsMode, Boolean createRemoteDir, IProgress`1 progress, CancellationToken token)
   at Duplicati.Library.Backend.FTP.PutAsync(String remotename, Stream input, CancellationToken cancelToken) in /source/duplicati/Duplicati/Library/Backend/FTP/FTPBackend.cs:line 448
   at Duplicati.Library.Backend.FTP.PutAsync(String remotename, String localname, CancellationToken cancelToken) in /source/duplicati/Duplicati/Library/Backend/FTP/FTPBackend.cs:line 477
   at Duplicati.Library.Backend.FTP.PutAsync(String remotename, String localname, CancellationToken cancelToken) in /source/duplicati/Duplicati/Library/Backend/FTP/FTPBackend.cs:line 477
   at Duplicati.Library.Utility.Utility.Await(Task task) in /source/duplicati/Duplicati/Library/Utility/Utility.cs:line 1455
   at Duplicati.CommandLine.BackendTool.Program.Main(String[] _args) in /source/duplicati/Duplicati/CommandLine/BackendTool/Program.cs:line 170

@FanDjango
Copy link
Collaborator

Looks like the connection is unexpectedly closed at the time you issue

# UploadStream("/dmg-bg.png", Overwrite, False)
>         FileExists("/dmg-bg.png")

# FileExists("/dmg-bg.png")
Error encountered uploading file
Status:   Error encountered uploading file

FileExists() wants access the capabilites of the server, which are thus not accessible, unless it is a **re-**connect, in case of which we already know the capabilites.

Log snippets again instead of a complete log. There are many debug messages especially in the connect logic, which would tell us that.

Look: I've done all I can on this and the other issue (#1686) and even put together a PR (#1704) for making this more configurable. It has been approved and I will merge it soon into master, still thinking about what the default behaviour should be. Once merged, some time later it will be released.

@FanDjango
Copy link
Collaborator

Note:

Status: Warning: sync dispose called for AsyncFtpClient object

This tells me, that you are not using the async client with an await.

await using.... {
}

@kenkendk
Copy link
Author

This tells me, that you are not using the async client with an await.

Yes, we have some legacy surrounding Dispose() so all disposing is done synchronously.

FileExists() wants access the capabilites of the server, which are thus not accessible, unless it is a **re-**connect, in case of which we already know the capabilites.

Yes, and because I did not call Connect() I think it is treated as **re-**connect, but there was never a connect, so the capabilities are not known.

Log snippets again instead of a complete log. There are many debug messages especially in the connect logic, which would tell us that.

This is the full log, everything that was printed was exactly that. From what I see, it never connects to the server.
If there is a "extra verbose log" flag that I can add, I can try that.

@FanDjango
Copy link
Collaborator

Yes, we have some legacy surrounding Dispose() so all disposing is done synchronously.

Ok, fine with that.

Yes, and because I did not call Connect() I think it is treated as **re-**connect, but there was never a connect, so the capabilities are not known.

Good, so that explains it well.

This is the full log, everything that was printed was exactly that. From what I see, it never connects to the server.
If there is a "extra verbose log" flag that I can add, I can try that.

Ok, so the entire sequence starts with UploadStream. That was for me hard to believe so I assumed there was stuff prior to that missing.

Well, with the new PR, when it is finally put in production - you can perhaps choose to specifiy client.Config.SelfConnectMode = FtpSelfConnectMode.OnConnectionLost.

Which brings us back to the original discussion

I would suggest either making all methods auto-connect or all methods not auto connecting.

which the new PR then cleanly adresses.

I am still putting some finishing touches to it but it's nearly there....

@FanDjango
Copy link
Collaborator

It's merged. Any way you could use it prior to it being released?

@kenkendk
Copy link
Author

Any way you could use it prior to it being released?

Not really unfortunately. I am not the end-user myself, so I do not have the issue directly.
I can test the update, but after I added the Connect() call, I have not had any issues.
I reverted my change to avoid calling Connect() to provide the extra data listed here.

@FanDjango
Copy link
Collaborator

Ok. I am already using this internally here and I am confident it will help your app.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants