As a data consumer, I want the /check-url endpoint to accurately cache results #875

@mojomonger

Description

When I call the check-url endpoint with the following URL:

https://archive.org/services/context/iari/v2/check-url?url=https://web.archive.org/web/20170726234423/https://minnesotastreetproject.com/exhibitions/1275-minnesota-st/internet-archive%E2%80%99s-2017-artist-residence-exhibition

The returned results do not include the "testdeadlink_status_code" property. This indicates that something is wrong with the caching process, because an earlier /check-url request for this URL was made with the "refresh=true" flag set.

first_level_domain: "archive.org",
fld_is_ip: false,
url: "https://web.archive.org/web/20170726234423/https://minnesotastreetproject.com/exhibitions/1275-minnesota-st/internet-archive’s-2017-artist-residence-exhibition",
fixed_url: "",
scheme: "https",
netloc: "web.archive.org",
tld: "org",
unrecognized_tld_length: false,
added_http_scheme_worked: false,
malformed_url: false,
malformed_url_details: null,
request_error: false,
request_error_details: "",
dns_record_found: true,
dns_no_answer: false,
dns_error: false,
status_code: 200,
timeout: 60,
dns_error_details: "",
response_headers: {},
timestamp: 1682018676,
isodate: "2023-04-20T19:24:36.221801",
id: "bbfeb6dd",
served_from_cache: true
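The symptom above (a result served from cache that lacks an expected property) can be checked mechanically. The following is a minimal sketch, not IARI code: the helper name `is_stale_cache_entry` is hypothetical, and the key is assumed to be spelled `testdeadlink_status_code`.

```python
# Hypothetical helper, not part of IARI: flags a cached check-url result
# that lacks a key the consumer expects.

# Assumption: this is the spelling of the property named in this report.
EXPECTED_KEY = "testdeadlink_status_code"


def is_stale_cache_entry(result: dict, expected_key: str = EXPECTED_KEY) -> bool:
    """Return True when a result came from the cache but lacks expected_key."""
    return bool(result.get("served_from_cache")) and expected_key not in result


# A trimmed copy of the result shown in this report.
cached = {
    "status_code": 200,
    "timestamp": 1682018676,
    "served_from_cache": True,
}

print(is_stale_cache_entry(cached))
```

A consumer could use a check like this to decide when to retry with refresh=true, although in this report the refresh request itself fails.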

When check-url is called with refresh=true, a 500 error occurs:

https://archive.org/services/context/iari/v2/check-url?refresh=true&url=https://web.archive.org/web/20170726234423/https://minnesotastreetproject.com/exhibitions/1275-minnesota-st/internet-archive%E2%80%99s-2017-artist-residence-exhibition

returns:

Internal Server Error
The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.

Something appears to go wrong while processing this URL when refresh=true is set.
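To reproduce both requests side by side, a small script along these lines may help. It is only a sketch: `build_check_url` and `fetch_status` are hypothetical names, and the script percent-encodes the whole `url` parameter, which differs textually from the hand-written URLs in this report but should decode to the same value on the server.

```python
from urllib.error import HTTPError
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://archive.org/services/context/iari/v2/check-url"

# The archived page from this report, with the apostrophe written literally
# so that urlencode produces %E2%80%99 for it.
TARGET = (
    "https://web.archive.org/web/20170726234423/"
    "https://minnesotastreetproject.com/exhibitions/1275-minnesota-st/"
    "internet-archive\u2019s-2017-artist-residence-exhibition"
)


def build_check_url(target: str, refresh: bool = False) -> str:
    """Build a check-url request URL, optionally with refresh=true."""
    params = {"refresh": "true", "url": target} if refresh else {"url": target}
    return f"{BASE}?{urlencode(params)}"


def fetch_status(request_url: str, timeout: float = 60) -> int:
    """Return the HTTP status code, including error codes such as 500."""
    try:
        with urlopen(request_url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code


# Print the two request URLs; pass either one to fetch_status() to hit the
# live service (not done here, so the script runs without network access).
print(build_check_url(TARGET))
print(build_check_url(TARGET, refresh=True))
```

Comparing `fetch_status` for the two URLs would show whether the cached request still returns 200 while the refresh request returns 500.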

Status

PR ready for review/test
