Preserve Wikipedia language setting for Wikiless redirects

Wikipedia -> Wikiless redirects always produce an English-language
result, even if the Wikipedia result would have been in another
language. This is because Wikipedia uses language-specific subdomains
(e.g. de.wikipedia.org, en.wikipedia.org) whereas Wikiless uses a
"lang" URL param.

This is fixed by inspecting the subdomain of the Wikipedia link and
passing that value to Wikiless as the lang param when it looks like a
language code (currently any 2-char subdomain).
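
As a rough illustration of the check described above, the subdomain test can be isolated into a small helper. This is only a sketch: the function name `wikiless_lang_param` is hypothetical and does not appear in the commit, which performs the same steps inline.

```python
def wikiless_lang_param(hostname: str) -> str:
    """Return '?lang=xx' when the hostname starts with a 2-char subdomain.

    Hypothetical helper mirroring the inline logic in the commit; it treats
    any 2-char leading label as a language code, as the message describes.
    """
    if 'wikipedia' not in hostname:
        return ''
    subdomain = hostname.split('.')[0]
    return f'?lang={subdomain}' if len(subdomain) == 2 else ''


print(wikiless_lang_param('de.wikipedia.org'))   # ?lang=de
print(wikiless_lang_param('www.wikipedia.org'))  # (empty string)
```

Under this rule, a 3-char label such as `www` yields no param, so such links keep the Wikiless default language.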

See #805
main
Ben Busby 2022-07-06 09:49:43 -06:00
parent 7164d066c3
commit f688b88bd8
1 changed files with 10 additions and 1 deletions

@@ -134,7 +134,16 @@ def get_site_alt(link: str) -> str:
         if not hostname or site_key not in hostname or not SITE_ALTS[site_key]:
             continue
 
-        link = link.replace(hostname, SITE_ALTS[site_key])
+        # Wikipedia -> Wikiless replacements require the subdomain (if it's
+        # a 2-char language code) to be passed as a URL param to Wikiless
+        # in order to preserve the language setting.
+        url_params = ''
+        if 'wikipedia' in hostname:
+            subdomain = hostname.split('.')[0]
+            if len(subdomain) == 2:
+                url_params = f'?lang={subdomain}'
+
+        link = link.replace(hostname, SITE_ALTS[site_key]) + url_params
         for prefix in SKIP_PREFIX:
             link = link.replace(prefix, '//')
         break
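
The change appends `?lang=...` directly to the end of the link, which gives a clean URL when the link has no query string or fragment. If a link already carried either, one possible variant is to merge the param with the standard library's `urllib.parse`. The sketch below is not part of this commit; `add_lang_param` and the `wikiless.example` host are placeholders chosen for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_lang_param(link: str, lang: str) -> str:
    """Add lang=<lang> to a link, keeping any existing query and fragment.

    Hypothetical helper, not taken from the commit.
    """
    parts = urlsplit(link)
    # Keep existing params (including blank ones) and append the new one.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(('lang', lang))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_lang_param('https://wikiless.example/wiki/Berlin', 'de'))
print(add_lang_param('https://wikiless.example/w/index.php?title=Berlin#History', 'de'))
```

The first call should yield `https://wikiless.example/wiki/Berlin?lang=de`; the second keeps `title=Berlin`, adds `lang=de` after it, and leaves the `#History` fragment at the end.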