urllib.error.HTTPError: HTTP Error 404: Not Found #1

@AlanTanKX

I downloaded the files and successfully created a new environment from the provided "environment.yml" file.

However, I got an error message when running the following in the terminal: python download.py --years 2023 --data_dir .

The error message is below:

(patents) C:\Users\alant>python download.py --years 2023 --data_dir .
Preparing to download all USPTO patents from 2023 ...
Found 18 releases from 2023
Directory for 2023 already exists.
Directory for 2023\I20230103 already exists.
2023\I20230103.tar: 0.00B [00:07, ?B/s]
Traceback (most recent call last):
  File "C:\Users\alant\download.py", line 160, in <module>
    main(args)
  File "C:\Users\alant\download.py", line 134, in main
    download_url(
  File "C:\Users\alant\download.py", line 73, in download_url
    urllib.request.urlretrieve(url, filename=output_path, reporthook=t.update_to)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 239, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 523, in open
    response = meth(req, response)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 632, in http_response
    response = self.parent.error(
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 561, in error
    return self._call_chain(*args)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 494, in _call_chain
    result = func(*args)
  File "C:\Users\alant\anaconda3\envs\patents\lib\urllib\request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

The URL used in download.py is correct and opens in the browser, so I am unsure why this error is raised. The URL is https://bulkdata.uspto.gov/data/patent/grant/redbook/2023/. I get the same error message when running the script for other years.
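
One way to narrow this down is to check the HTTP status of the exact URL the script requests, since the traceback shows the failure happens while fetching 2023\I20230103.tar rather than the directory listing. The sketch below is only an illustration: the helper name http_status is hypothetical and not part of download.py, and it assumes (unconfirmed) that the per-release file URL differs from the directory URL that opens in the browser.

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url (hypothetical helper, not in download.py).

    opener defaults to urllib.request.urlopen; it is a parameter only so the
    function can be exercised without network access.
    """
    try:
        with opener(url) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; err.code holds the status.
        return err.code


# Example use (requires network access): compare the directory URL with the
# value of `url` that download_url() receives, e.g. printed just before the
# urlretrieve call.
# print(http_status("https://bulkdata.uspto.gov/data/patent/grant/redbook/2023/"))
```

If the directory URL returns 200 while the file URL passed to urlretrieve returns 404, the problem would be in how the per-file URL is constructed, not in the base URL.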

Thanks very much for the help in troubleshooting this issue!
