Python Find Question

I am using Python to extract the filename from a link using rfind, like below:

url = 'http://www.google.com/test.php'
print url[url.rfind('/') + 1:]

This works ok with links with

Solution 1:

Just removing the slash at the end won't work, because you can have a URL that looks like this:

http://www.google.com/test.php?filepath=tests/hey.xml

...in which case you'll get back "hey.xml". Instead of manually checking for this, you can use urlparse to get rid of the parameters, then do the check other people suggested:

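# Python 2 module; in Python 3 use: from urllib.parse import urlparse
from urlparse import urlparse
url = "http://www.google.com/test.php?something=heyharr/sir/a.txt"
# Index 2 of the parse result is the path, without the query string
f = urlparse(url)[2].rstrip("/")
print f[f.rfind("/")+1:]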
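
To make the difference concrete, here is a minimal sketch comparing the plain rfind approach with the urlparse approach on the URL from above. It assumes Python 3, where the urlparse module lives at urllib.parse; the variable names naive and parsed are only for illustration.

```python
from urllib.parse import urlparse

url = "http://www.google.com/test.php?filepath=tests/hey.xml"

# Plain rfind: the last "/" in the string sits inside the query string
naive = url[url.rfind("/") + 1:]

# Parse first: .path is only "/test.php", so the query cannot interfere
path = urlparse(url).path.rstrip("/")
parsed = path[path.rfind("/") + 1:]

print(naive)   # hey.xml
print(parsed)  # test.php
```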

Solution 2:

Use [r]strip to remove trailing slashes:

url.rstrip('/').rsplit('/', 1)[-1]

If you need to handle a wider range of URLs, including URLs with ?queries, #anchors, or no path at all, do it properly with urlparse:

path = urlparse.urlparse(url).path
return path.rstrip('/').rsplit('/', 1)[-1] or '(root path)'
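
The two lines above belong inside a function. A possible wrapper, sketched for Python 3 (urllib.parse instead of the Python 2 urlparse module), could look like this. The function name last_segment and the sample URLs are assumptions made for illustration, not part of the original answer.

```python
from urllib.parse import urlparse


def last_segment(url):
    """Return the last path segment of url, or '(root path)' if there is none."""
    # .path drops the scheme, host, ?query and #anchor
    path = urlparse(url).path
    # Strip trailing slashes, then keep whatever follows the last remaining "/"
    return path.rstrip('/').rsplit('/', 1)[-1] or '(root path)'


samples = [
    'http://www.google.com/test.php',
    'http://www.google.com/dir/test.php/#top',
    'http://www.google.com/test.php?a=b/c',
    'http://www.google.com',
]
for sample in samples:
    print(sample, '->', last_segment(sample))
```

The first three sample URLs give test.php; the last one has an empty path, so the or fallback returns '(root path)'.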

Solution 3:

Filenames with a slash at the end are technically still path definitions and indicate that the index file is to be read. If you actually have one that ends in test.php/, I would consider that an error. In any case, you can strip the / from the end before running your code as follows:

url = url.rstrip('/')

Solution 4:

There is a library called urlparse that will parse the URL for you, but it still doesn't remove the / at the end, so one of the above will be the best option.

Solution 5:

Just for fun, you can use a Regexp:

import re
print re.search('/([^/]+)/?$', url).group(1)
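
Note that re.search returns None when nothing matches, so calling .group(1) directly can raise AttributeError. A guarded variant might look like the sketch below, written for Python 3; the names LAST_SEGMENT and last_segment_re are hypothetical.

```python
import re

# Same pattern as above, as a raw string: a "/", then one or more
# non-slash characters, then an optional trailing "/" at the end
LAST_SEGMENT = re.compile(r'/([^/]+)/?$')


def last_segment_re(url):
    """Return the last slash-delimited segment of url, or None if no match."""
    match = LAST_SEGMENT.search(url)
    # search() gives None when the pattern is not found
    return match.group(1) if match else None


print(last_segment_re('http://www.google.com/test.php'))   # test.php
print(last_segment_re('http://www.google.com/test.php/'))  # test.php
```

The pattern works on the raw string, so it has the same weakness as plain rfind for query strings that contain slashes, and for a bare host such as http://www.google.com it returns the host name.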
