Lugaru's Epsilon
Programmer's
Editor



Parsing URLs

prepare_url_operation(char *file, int op, struct url_parts *parts)
get_password(char *res, char *host, char *usr)
int parse_url(char *url, struct url_parts *p)
int divide_url(char *url, struct url_parts *p)

Several subroutines parse URLs into their component parts. These parts are stored in a url_parts structure, which has fields for a URL's service (http, ftp, and so forth), host name, port, user name if any, password if any, and "file name": the final part of a URL, which may be a file name, a web page name, or something else. Since an empty user name or password is legal, but differs from an omitted one, the structure also has fields that record whether each of these is present.

The prepare_url_operation( ) subroutine parses a URL and fills one of these structures. The operation code op is one of those used with the ftp_op( ) subroutine described in Internet Primitives. The subroutine complains if it doesn't recognize the service name, or if the service is something other than FTP and the operation is anything other than reading; for example, it complains if you try to perform an FTP_LIST operation with a telnet:// URL. It also prompts for a password if necessary, and saves the password for later use, by calling the get_password( ) subroutine.

The get_password( ) subroutine gets the password for a particular user/host combination. Specify the user and host, and the subroutine fills in the provided character array res with the password. The first time, it prompts the user for the password; it then stores the password and returns it without prompting on later requests. The subroutine takes care that the password never appears in a state file or session file. To discard a particular remembered password, pass NULL as the first parameter (res). The next time get_password( ) is asked for the password of that user on that host, it prompts the user again.
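The remember-then-reuse behavior described above can be sketched in standalone C. This is only an illustration, not Epsilon's implementation: sketch_get_password, the saved_pw table, MAX_SAVED, and demo_prompt (a stand-in for the interactive prompt that returns a fixed string) are all hypothetical names, and the extra ressz parameter is an assumption added so the sketch can bound its copies.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SAVED 16   /* hypothetical table size */

/* One remembered password, held in memory only (never written to disk). */
struct saved_pw { int used; char host[128], usr[64], pwd[64]; };
static struct saved_pw saved[MAX_SAVED];

/* Stand-in for the interactive prompt; counts calls so reuse is visible. */
static int prompt_calls;
static void demo_prompt(const char *host, const char *usr,
                        char *out, size_t outsz)
{
    (void)host; (void)usr;
    prompt_calls++;
    snprintf(out, outsz, "%s", "example-password");
}

/* Hypothetical analogue of get_password(): not Epsilon's source. */
void sketch_get_password(char *res, size_t ressz,
                         const char *host, const char *usr)
{
    struct saved_pw *hit = NULL, *spare = NULL;
    int i;

    /* Look for a remembered entry, noting the first unused slot. */
    for (i = 0; i < MAX_SAVED; i++) {
        if (saved[i].used && strcmp(saved[i].host, host) == 0
                          && strcmp(saved[i].usr, usr) == 0)
            hit = &saved[i];
        else if (!saved[i].used && spare == NULL)
            spare = &saved[i];
    }
    if (res == NULL) {              /* NULL buffer: forget this entry */
        if (hit != NULL)
            memset(hit, 0, sizeof *hit);
        return;
    }
    if (hit == NULL) {              /* first request: prompt, then remember */
        char tmp[64];
        demo_prompt(host, usr, tmp, sizeof tmp);
        if (spare == NULL) {        /* table full: answer without saving */
            snprintf(res, ressz, "%s", tmp);
            return;
        }
        spare->used = 1;
        snprintf(spare->host, sizeof spare->host, "%s", host);
        snprintf(spare->usr, sizeof spare->usr, "%s", usr);
        snprintf(spare->pwd, sizeof spare->pwd, "%s", tmp);
        hit = spare;
    }
    snprintf(res, ressz, "%s", hit->pwd);
}
```

In this sketch, two consecutive requests for the same user and host prompt only once; a call with a NULL buffer clears the entry, so the following request prompts again.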

The prepare_url_operation( ) subroutine calls the parse_url( ) subroutine to actually parse the URL into a url_parts structure. The parse_url( ) subroutine returns zero if the URL is invalid, or nonzero if it appears to be legal.

The divide_url( ) subroutine is similar to parse_url( ), but doesn't divide the host section into its component parts. Like parse_url( ), it returns zero if the URL is invalid, or nonzero if it appears to be legal.

For example, given the URL scp://bob:secret%2Fcode@example.com:1022/path/to/file, both divide_url( ) and parse_url( ) set the service member of the url_parts structure to "scp" and the fname member to "path/to/file".

But divide_url( ) then sets the host member to "bob:secret%2Fcode@example.com:1022", whereas parse_url( ) sets host to "example.com", port to "1022", usr to "bob", and pwd to "secret/code", also setting the have_password and have_usr members nonzero since the URL specified both. Notice that parse_url( ) decodes any %-escaped sequences in the user name or password sections, changing %2F to / in this example.
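The extra step that parse_url( ) performs on the host section can be sketched in standalone C. This is a rough illustration under stated assumptions, not Epsilon's code: struct sketch_url_parts only mirrors the member names mentioned above (the real url_parts types and sizes are not given here), and sketch_split_host, percent_decode, and hex_val are hypothetical helpers. The sketch does not handle bracketed IPv6 host names.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the members named in the text; the real
   url_parts layout, types, and sizes are assumptions here. */
struct sketch_url_parts {
    char host[128], port[16], usr[64], pwd[64];
    int have_usr, have_password;
};

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode %XX sequences in place; malformed escapes are left alone. */
static void percent_decode(char *s)
{
    char *out = s;
    while (*s) {
        int hi = hex_val((unsigned char)s[1]);
        int lo = hi >= 0 ? hex_val((unsigned char)s[2]) : -1;
        if (*s == '%' && hi >= 0 && lo >= 0) {
            *out++ = (char)(hi * 16 + lo);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Split "usr:pwd@host:port" into fields, decoding usr and pwd. */
void sketch_split_host(const char *section, struct sketch_url_parts *p)
{
    char buf[256];
    char *at, *colon, *hostport = buf;

    memset(p, 0, sizeof *p);
    snprintf(buf, sizeof buf, "%s", section);

    at = strrchr(buf, '@');          /* user info ends at the last '@' */
    if (at != NULL) {
        *at = '\0';
        hostport = at + 1;
        colon = strchr(buf, ':');    /* first ':' separates usr from pwd */
        if (colon != NULL) {
            *colon = '\0';
            snprintf(p->pwd, sizeof p->pwd, "%s", colon + 1);
            percent_decode(p->pwd);
            p->have_password = 1;    /* present, even if empty */
        }
        snprintf(p->usr, sizeof p->usr, "%s", buf);
        percent_decode(p->usr);
        p->have_usr = 1;
    }
    colon = strrchr(hostport, ':');  /* last ':' separates host from port */
    if (colon != NULL) {
        *colon = '\0';
        snprintf(p->port, sizeof p->port, "%s", colon + 1);
    }
    snprintf(p->host, sizeof p->host, "%s", hostport);
}
```

Given the section "bob:secret%2Fcode@example.com:1022", the sketch fills the fields with the same values the text lists for parse_url( ). Given "bob:@example.com", it sets have_password with an empty pwd, which shows why the presence flags are kept separate from the strings.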

int split_string(char *part1, char *cs, char *part2)
int reverse_split_string(char *part1, char *cs, char *part2)

The parse_url( ) subroutine uses two helper subroutines. The split_string( ) subroutine divides a string part1 into two parts, by searching it for one of a set of delimiter characters cs. It finds the first character in part1 that appears in cs. Then it copies the remainder of part1 to part2, and removes the delimiter character and the remainder from part1. It returns the delimiter character it found. If no delimiter character appears in part1, it sets part2 to "" and returns 0. The reverse_split_string( ) subroutine is almost identical; it just searches through part1 from the other end, and splits the string at the last character in part1 that appears in cs.
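The splitting behavior described above can be sketched in standalone C. This is not Epsilon's source; sketch_split_string and sketch_reverse_split_string are hypothetical names, and the sketch assumes that "the remainder" means the text after the delimiter, so the delimiter itself ends up in neither part.

```c
#include <string.h>

/* Hypothetical analogue of split_string(): split part1 at the first
   character that appears in cs. */
int sketch_split_string(char *part1, const char *cs, char *part2)
{
    char *p = strpbrk(part1, cs);    /* first char of part1 found in cs */
    int delim;

    if (p == NULL) {                 /* no delimiter: part2 becomes "" */
        part2[0] = '\0';
        return 0;
    }
    delim = (unsigned char)*p;
    strcpy(part2, p + 1);            /* text after the delimiter */
    *p = '\0';                       /* part1 keeps the text before it */
    return delim;
}

/* Hypothetical analogue of reverse_split_string(): same idea, but
   split at the last character of part1 that appears in cs. */
int sketch_reverse_split_string(char *part1, const char *cs, char *part2)
{
    size_t i = strlen(part1);

    while (i > 0) {
        i--;
        if (strchr(cs, part1[i]) != NULL) {
            int delim = (unsigned char)part1[i];
            strcpy(part2, part1 + i + 1);
            part1[i] = '\0';
            return delim;
        }
    }
    part2[0] = '\0';
    return 0;
}
```

With part1 holding "bob:secret@example.com" and cs set to "@:", the forward version splits at the first ':' and the reverse version splits at the '@'. The caller must supply a part2 buffer at least as large as part1.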

char *get_url_file_part(char *url, int sep)

The get_url_file_part( ) subroutine helps to parse URLs. It takes a URL and returns a pointer to the position within it where its file part begins. For example, in the URL http://www.lugaru.com/why-lugaru.html, the subroutine returns a pointer to the start of "why". If sep is nonzero, the subroutine instead returns a pointer to the / just before "why". If url is not a URL, the subroutine returns a pointer to its first character.
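A minimal standalone C sketch of this lookup follows. It is not Epsilon's implementation: sketch_url_file_part is a hypothetical name, the test for "is a URL" is assumed to be the presence of "://", and returning a pointer to the end of the string when a URL has no file part is also an assumption, since the text does not cover that case.

```c
#include <string.h>

/* Hypothetical analogue of get_url_file_part(): not Epsilon's source. */
char *sketch_url_file_part(char *url, int sep)
{
    char *scheme_end = strstr(url, "://");
    char *slash;

    if (scheme_end == NULL)          /* not a URL: return its start */
        return url;
    slash = strchr(scheme_end + 3, '/');
    if (slash == NULL)               /* assumption: no file part, return end */
        return url + strlen(url);
    return sep ? slash : slash + 1;  /* include the '/' only if sep is set */
}
```

For http://www.lugaru.com/why-lugaru.html, the sketch returns a pointer to "why-lugaru.html" when sep is zero and to "/why-lugaru.html" when sep is nonzero.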





Lugaru Copyright (C) 1984, 2012 Lugaru Software Ltd. All Rights Reserved.