Package org.apache.nutch.protocol.ftp
Class Client
- java.lang.Object
-
- org.apache.commons.net.SocketClient
-
- org.apache.commons.net.ftp.FTP
-
- org.apache.nutch.protocol.ftp.Client
-
public class Client extends org.apache.commons.net.ftp.FTP
Client.java encapsulates the functionality Nutch needs to get a directory listing and retrieve a file from an FTP server. This class takes care of all low-level details of interacting with an FTP server and provides a convenient higher-level interface. Modified from FTPClient.java in Apache commons-net.
Notes by John Xing: FTP server implementations are hardly uniform and none seems to follow the RFCs wholeheartedly. We have no choice but to assume a common denominator, as follows:
(1) Use stream mode for data transfer. Block mode would be better for downloading multiple files and for partial downloads. However, not every ftpd supports block mode.
(2) Use passive mode for the data connection, so that Nutch works when run behind a firewall.
(3) The data connection is opened and closed per FTP command, for the reasons listed in (1). With some FTP servers, when a partial download is enforced by closing the data channel socket on our client side, the server side immediately closes the control channel (socket). Our code deals with this bad behavior.
(4) LIST is used to obtain remote file attributes if possible. MDTM and SIZE would be nice, but they are not as ubiquitously implemented as LIST.
(5) Avoid using ABOR in a single thread? Do not use it at all.
About exceptions: some specific exceptions are re-thrown as one of FtpException*.java. In fact, each function throws FtpException*.java or passes IOException through.
- Author:
- John Xing
-
-
Field Summary
-
Fields inherited from class org.apache.commons.net.ftp.FTP
_commandSupport_, _controlEncoding, _controlInput_, _controlOutput_, _newReplyString, _replyCode, _replyLines, _replyString, ASCII_FILE_TYPE, BINARY_FILE_TYPE, BLOCK_TRANSFER_MODE, CARRIAGE_CONTROL_TEXT_FORMAT, COMPRESSED_TRANSFER_MODE, DEFAULT_CONTROL_ENCODING, DEFAULT_DATA_PORT, DEFAULT_PORT, EBCDIC_FILE_TYPE, FILE_STRUCTURE, LOCAL_FILE_TYPE, NON_PRINT_TEXT_FORMAT, PAGE_STRUCTURE, RECORD_STRUCTURE, REPLY_CODE_LEN, STREAM_TRANSFER_MODE, strictMultilineParsing, TELNET_TEXT_FORMAT
-
-
Constructor Summary
Constructors Constructor Description Client()
Public default constructor
-
Method Summary
Modifier and Type Method Description
protected Socket __openPassiveDataConnection(int command, String arg)
Open a passive data connection socket.
void disconnect()
Closes the connection to the FTP server and restores connection parameters to the default values.
String getSystemName()
Fetches the system type name from the server and returns the string.
boolean isRemoteVerificationEnabled()
Return whether or not verification of the remote host participating in data connections is enabled.
boolean login(String username, String password)
Login to the FTP server using the provided username and password.
boolean logout()
Logout of the FTP server by sending the QUIT command.
void retrieveFile(String path, OutputStream os, int limit)
Retrieve file for path.
void retrieveList(String path, List<org.apache.commons.net.ftp.FTPFile> entries, int limit, org.apache.commons.net.ftp.FTPFileEntryParser parser)
Retrieve list reply for path.
boolean sendNoOp()
Sends a NOOP command to the FTP server.
void setDataTimeout(int timeout)
Sets the timeout in milliseconds to use for data connection.
boolean setFileType(int fileType)
Sets the file type to be transferred.
void setRemoteVerificationEnabled(boolean enable)
Enable or disable verification that the remote host taking part in a data connection is the same as the host to which the control connection is attached.
-
Methods inherited from class org.apache.commons.net.ftp.FTP
__getReplyNoReport, __noop, _connectAction_, _connectAction_, abor, acct, allo, allo, allo, allo, appe, cdup, cwd, dele, eprt, epsv, feat, getCommandSupport, getControlEncoding, getReply, getReplyCode, getReplyString, getReplyStrings, help, help, isStrictMultilineParsing, isStrictReplyParsing, list, list, mdtm, mfmt, mkd, mlsd, mlsd, mlst, mlst, mode, nlst, nlst, noop, pass, pasv, port, pwd, quit, rein, rest, retr, rmd, rnfr, rnto, sendCommand, sendCommand, sendCommand, sendCommand, sendCommand, sendCommand, setControlEncoding, setStrictMultilineParsing, setStrictReplyParsing, site, size, smnt, stat, stat, stor, stou, stou, stru, syst, type, type, user
-
Methods inherited from class org.apache.commons.net.SocketClient
addProtocolCommandListener, applySocketAttributes, connect, connect, connect, connect, connect, connect, createCommandSupport, fireCommandSent, fireReplyReceived, getCharset, getCharsetName, getConnectTimeout, getDefaultPort, getDefaultTimeout, getKeepAlive, getLocalAddress, getLocalPort, getProxy, getReceiveBufferSize, getRemoteAddress, getRemotePort, getSendBufferSize, getServerSocketFactory, getSoLinger, getSoTimeout, getTcpNoDelay, isAvailable, isConnected, removeProtocolCommandListener, setCharset, setConnectTimeout, setDefaultPort, setDefaultTimeout, setKeepAlive, setProxy, setReceiveBufferSize, setSendBufferSize, setServerSocketFactory, setSocketFactory, setSoLinger, setSoTimeout, setTcpNoDelay, verifyRemote
-
-
-
-
Method Detail
-
__openPassiveDataConnection
protected Socket __openPassiveDataConnection(int command, String arg) throws IOException, FtpExceptionCanNotHaveDataConnection
Open a passive data connection socket.
- Parameters:
command - the FTP command to be sent to the FTP server
arg - the argument associated with the command
- Returns:
- a passive Socket connection
- Throws:
IOException - if there is an error entering passive mode
FtpExceptionCanNotHaveDataConnection - can occur if there is a malformed server reply
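As an illustration of what a "malformed server reply" can mean here, the following is a minimal sketch of parsing a PASV reply of the form `227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)` as defined in RFC 959. The class `PasvReplySketch` and its `parse` method are hypothetical names used only for this example; they are not part of Nutch or commons-net, and the real method may parse the reply differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Nutch or commons-net. */
final class PasvReplySketch {
  // Six comma-separated decimal numbers: h1,h2,h3,h4,p1,p2 (RFC 959).
  private static final Pattern HOST_PORT = Pattern.compile(
      "(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3}),(\\d{1,3})");

  /** Returns "host:port" taken from a 227 reply, or throws if the reply is malformed. */
  static String parse(String reply) {
    if (reply == null || !reply.startsWith("227")) {
      throw new IllegalArgumentException("Not a 227 reply: " + reply);
    }
    Matcher m = HOST_PORT.matcher(reply);
    if (!m.find()) {
      throw new IllegalArgumentException("Malformed PASV reply: " + reply);
    }
    int[] n = new int[6];
    for (int i = 0; i < 6; i++) {
      n[i] = Integer.parseInt(m.group(i + 1));
      if (n[i] > 255) {
        throw new IllegalArgumentException("Octet out of range in: " + reply);
      }
    }
    // The port is sent as two octets, high byte first.
    int port = (n[4] << 8) | n[5];
    return n[0] + "." + n[1] + "." + n[2] + "." + n[3] + ":" + port;
  }
}
```

A caller in the style of this class would presumably translate the `IllegalArgumentException` in the sketch into `FtpExceptionCanNotHaveDataConnection`; that mapping is an assumption, not something the documentation above states.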
-
setDataTimeout
public void setDataTimeout(int timeout)
Sets the timeout in milliseconds to use for the data connection. The timeout is set immediately after opening the data connection.
- Parameters:
timeout - maximum timeout in milliseconds
-
disconnect
public void disconnect() throws IOException
Closes the connection to the FTP server and restores connection parameters to the default values.
- Overrides:
disconnect in class org.apache.commons.net.ftp.FTP
- Throws:
IOException - If an error occurs while disconnecting.
-
setRemoteVerificationEnabled
public void setRemoteVerificationEnabled(boolean enable)
Enable or disable verification that the remote host taking part in a data connection is the same as the host to which the control connection is attached. The default is for verification to be enabled. You may set this value at any time, whether the client is currently connected or not.
- Parameters:
enable - True to enable verification, false to disable verification.
-
isRemoteVerificationEnabled
public boolean isRemoteVerificationEnabled()
Return whether or not verification of the remote host participating in data connections is enabled. The default behavior is for verification to be enabled.
- Returns:
- True if verification is enabled, false if not.
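The check that this flag controls can be pictured as a comparison between the address of the control connection and the address of the peer on the data connection. The sketch below is an assumption about the general shape of such a check, not the code used by this class; `RemoteVerificationSketch`, `accept`, and `ipv4` are hypothetical names.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper for illustration only; not part of Nutch or commons-net. */
final class RemoteVerificationSketch {

  /**
   * Accepts the data connection when verification is disabled, or when the
   * data peer has the same address as the control connection's host.
   */
  static boolean accept(InetAddress controlHost, InetAddress dataHost,
      boolean verificationEnabled) {
    return !verificationEnabled || controlHost.equals(dataHost);
  }

  /** Builds an IPv4 address from four octets without any DNS lookup. */
  static InetAddress ipv4(int a, int b, int c, int d) {
    try {
      return InetAddress.getByAddress(
          new byte[] {(byte) a, (byte) b, (byte) c, (byte) d});
    } catch (UnknownHostException e) {
      // Only thrown for an illegal address length, which cannot happen here.
      throw new IllegalArgumentException(e);
    }
  }
}
```

Disabling verification can be necessary when a server legitimately hands data transfers to a different host, at the cost of accepting data from a peer that was not authenticated on the control channel.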
-
login
public boolean login(String username, String password) throws IOException
Login to the FTP server using the provided username and password.
- Parameters:
username - The username to login under.
password - The password to use.
- Returns:
- True if successfully completed, false if not.
- Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.
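Because FTPConnectionClosedException is a subclass of IOException, a caller that wants to treat a 421 close differently has to catch it before the general IOException. The sketch below shows only that catch ordering, using a stand-in exception so that it compiles without commons-net; `CatchOrderSketch`, `ConnectionClosedStandIn`, `command`, and `classify` are hypothetical names.

```java
import java.io.IOException;

/** Hypothetical example for illustration only; not part of Nutch or commons-net. */
final class CatchOrderSketch {

  /** Stand-in for FTPConnectionClosedException, which also extends IOException. */
  static final class ConnectionClosedStandIn extends IOException {
    ConnectionClosedStandIn(String message) {
      super(message);
    }
  }

  /** Simulates a command that fails in one of two ways. */
  static void command(boolean serverSent421) throws IOException {
    if (serverSent421) {
      throw new ConnectionClosedStandIn("421 Service not available");
    }
    throw new IOException("Connection reset");
  }

  /** The more specific exception must be caught first. */
  static String classify(boolean serverSent421) {
    try {
      command(serverSent421);
      return "ok";
    } catch (ConnectionClosedStandIn e) {
      return "reconnect"; // the server closed the control connection
    } catch (IOException e) {
      return "io-error"; // any other I/O failure
    }
  }
}
```

The same ordering applies to every method on this page that lists both exceptions.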
-
logout
public boolean logout() throws IOException
Logout of the FTP server by sending the QUIT command.
- Returns:
- True if successfully completed, false if not.
- Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.
-
retrieveList
public void retrieveList(String path, List<org.apache.commons.net.ftp.FTPFile> entries, int limit, org.apache.commons.net.ftp.FTPFileEntryParser parser) throws IOException, FtpExceptionCanNotHaveDataConnection, FtpExceptionUnknownForcedDataClose, FtpExceptionControlClosedByForcedDataClose
Retrieve list reply for path.
- Parameters:
path - a path on the FTP server
entries - an initialized List of FTPFile objects to populate with entries found at the path
limit - optionally impose a download limit if this value is >= 0, otherwise no limit
parser - a configured FTPFileEntryParser
- Throws:
IOException - if there is a fatal I/O error, which could be related to opening a passive data connection or retrieving data from the specified path
FtpExceptionCanNotHaveDataConnection - if an error occurs whilst opening a passive data connection
FtpExceptionUnknownForcedDataClose - if there is a bad reply from the FTP server
FtpExceptionControlClosedByForcedDataClose - some FTP servers will close the control channel if the data channel socket is closed by our end before all data has been read out
-
retrieveFile
public void retrieveFile(String path, OutputStream os, int limit) throws IOException, FtpExceptionCanNotHaveDataConnection, FtpExceptionUnknownForcedDataClose, FtpExceptionControlClosedByForcedDataClose
Retrieve file for path.
- Parameters:
path - a path on the FTP server
os - an OutputStream to write data to
limit - optionally impose a download limit if this value is >= 0, otherwise no limit
- Throws:
IOException - if there is a fatal I/O error, which could be related to opening a passive data connection or retrieving data from the specified path
FtpExceptionCanNotHaveDataConnection - if an error occurs whilst opening a passive data connection
FtpExceptionUnknownForcedDataClose - if there is a bad reply from the FTP server
FtpExceptionControlClosedByForcedDataClose - some FTP servers will close the control channel if the data channel socket is closed by our end before all data has been read out
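The `limit` semantics described above (a value >= 0 caps the download, a negative value means no limit) can be sketched as a bounded stream copy. This is a minimal illustration of that parameter convention only, not the Nutch implementation, which also has to deal with closing the data socket and the server's reaction to it; `LimitedCopySketch` and `copy` are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper for illustration only; not part of Nutch or commons-net. */
final class LimitedCopySketch {

  /**
   * Copies bytes from in to out. A limit >= 0 stops the copy after at most
   * that many bytes; a negative limit copies until end of stream.
   * Returns the number of bytes copied.
   */
  static long copy(InputStream in, OutputStream out, int limit) throws IOException {
    byte[] buf = new byte[4096];
    long total = 0;
    while (limit < 0 || total < limit) {
      int want = buf.length;
      if (limit >= 0) {
        // Never read past the limit.
        want = (int) Math.min(want, limit - total);
      }
      int n = in.read(buf, 0, want);
      if (n < 0) {
        break; // end of stream reached before the limit
      }
      out.write(buf, 0, n);
      total += n;
    }
    return total;
  }
}
```

When the copy stops at the limit, the remaining data is left unread on the data connection, which is the situation that FtpExceptionControlClosedByForcedDataClose and FtpExceptionUnknownForcedDataClose describe.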
-
setFileType
public boolean setFileType(int fileType) throws IOException
Sets the file type to be transferred. This should be one of FTP.ASCII_FILE_TYPE, FTP.BINARY_FILE_TYPE, etc. The file type only needs to be set when you want to change the type. After changing it, the new type stays in effect until you change it again. The default file type is FTP.ASCII_FILE_TYPE if this method is never called.
- Parameters:
fileType - The _FILE_TYPE constant indicating the type of file.
- Returns:
- True if successfully completed, false if not.
- Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.
-
getSystemName
public String getSystemName() throws IOException, FtpExceptionBadSystResponse
Fetches the system type name from the server and returns the string. This value is cached for the duration of the connection after the first call to this method. In other words, only the first time that you invoke this method will it issue a SYST command to the FTP server. The client will remember the value and return the cached value until a call to disconnect.
- Returns:
- The system type name obtained from the server. null if the information could not be obtained.
- Throws:
FtpExceptionBadSystResponse - indicating a bad reply to the SYST command
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.
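The caching behavior described for getSystemName (issue SYST once, reuse the value, forget it on disconnect) can be sketched as follows. This is an illustration of the pattern under stated assumptions, not the class's actual code; `SystemNameCacheSketch` is a hypothetical name, and the `Supplier` stands in for sending the SYST command and reading its reply.

```java
import java.util.function.Supplier;

/** Hypothetical helper for illustration only; not part of Nutch or commons-net. */
final class SystemNameCacheSketch {
  // Stands in for sending SYST to the server and extracting the system name.
  private final Supplier<String> systCommand;
  private String cachedName;

  SystemNameCacheSketch(Supplier<String> systCommand) {
    this.systCommand = systCommand;
  }

  /** Issues the SYST stand-in only when no value is cached yet. */
  String getSystemName() {
    if (cachedName == null) {
      cachedName = systCommand.get();
    }
    return cachedName;
  }

  /** Forgets the cached value, so the next call asks the server again. */
  void disconnect() {
    cachedName = null;
  }
}
```

In this sketch a null result is not cached, so a failed lookup is retried on the next call; whether the real class behaves the same way is not stated in the documentation above.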
-
sendNoOp
public boolean sendNoOp() throws IOException
Sends a NOOP command to the FTP server. This is useful for preventing server timeouts.
- Returns:
- True if successfully completed, false if not.
- Throws:
org.apache.commons.net.ftp.FTPConnectionClosedException - If the FTP server prematurely closes the connection as a result of the client being idle or some other reason causing the server to send FTP reply code 421. This exception may be caught either as an IOException or independently as itself.
IOException - If an I/O error occurs while either sending a command to the server or receiving a reply from the server.
-
-