Allow defining custom redirects with `WHOOGLE_REDIRECTS`

Redirects to alternative frontends can now be defined using the
WHOOGLE_REDIRECTS environment variable. Usage is documented in the
readme, but is basically defined as <parent>:<new>.

Closes #988
main
Ben Busby 2023-05-19 12:15:15 -06:00
parent f213a2a64a
commit f65529f328
No known key found for this signature in database
GPG Key ID: B9B7231E01D924A1
3 changed files with 44 additions and 4 deletions

View File

@ -33,10 +33,11 @@ Contents
5. [Usage](#usage)
6. [Extra Steps](#extra-steps)
1. [Set Primary Search Engine](#set-whoogle-as-your-primary-search-engine)
2. [Prevent Downtime (Heroku Only)](#prevent-downtime-heroku-only)
3. [Manual HTTPS Enforcement](#https-enforcement)
4. [Using with Firefox Containers](#using-with-firefox-containers)
5. [Reverse Proxying](#reverse-proxying)
2. [Custom Redirecting](#custom-redirecting)
3. [Prevent Downtime (Heroku Only)](#prevent-downtime-heroku-only)
4. [Manual HTTPS Enforcement](#https-enforcement)
5. [Using with Firefox Containers](#using-with-firefox-containers)
6. [Reverse Proxying](#reverse-proxying)
1. [Nginx](#nginx)
7. [Contributing](#contributing)
8. [FAQ](#faq)
@ -401,6 +402,7 @@ There are a few optional environment variables available for customizing a Whoog
| WHOOGLE_USER_AGENT | The desktop user agent to use. Defaults to a randomly generated one. |
| WHOOGLE_USER_AGENT_MOBILE | The mobile user agent to use. Defaults to a randomly generated one. |
| WHOOGLE_USE_CLIENT_USER_AGENT | Enable to use your own user agent for all requests. Defaults to false. |
| WHOOGLE_REDIRECTS | Specify sites that should be redirected elsewhere. See (custom redirecting)(#custom-redirecting). |
| EXPOSE_PORT | The port where Whoogle will be exposed. |
| HTTPS_ONLY | Enforce HTTPS. (See [here](https://github.com/benbusby/whoogle-search#https-enforcement)) |
| WHOOGLE_ALT_TW | The twitter.com alternative to use when site alternatives are enabled in the config. Set to "" to disable. |
@ -451,6 +453,7 @@ Same as most search engines, with the exception of filtering by time range.
To filter by a range of time, append ":past <time>" to the end of your search, where <time> can be `hour`, `day`, `month`, or `year`. Example: `coronavirus updates :past hour`
## Extra Steps
### Set Whoogle as your primary search engine
*Note: If you're using a reverse proxy to run Whoogle Search, make sure the "Root URL" config option on the home page is set to your URL before going through these steps.*
@ -495,6 +498,32 @@ Browser settings:
- Manual
- Under search engines > manage search engines > add, manually enter your Whoogle instance details with a `<whoogle url>/search?q=%s` formatted search URL.
### Custom Redirecting
You can set custom site redirects using the `WHOOGLE_REDIRECTS` environment
variable. A lot of sites, such as Twitter, Reddit, etc, have built-in redirects
to [Farside links](https://sr.ht/~benbusby/farside), but you may want to define
your own.
To do this, you can use the following syntax:
```
WHOOGLE_REDIRECTS="<parent_domain>:<new_domain>"
```
For example, if you want to redirect from "badsite.com" to "goodsite.com":
```
WHOOGLE_REDIRECTS="badsite.com:goodsite.com"
```
This can be used for multiple sites as well, with comma separation:
```
WHOOGLE_REDIRECTS="badA.com:goodA.com,badB.com:goodB.com"
```
NOTE: Do not include "http(s)://" when defining your redirect.
### Prevent Downtime (Heroku only)
Part of the deal with Heroku's free tier is that you're allocated 550 hours/month (meaning it can't stay active 24/7), and the app is temporarily shut down after 30 minutes of inactivity. Once it becomes inactive, any Whoogle searches will still work, but it'll take an extra 10-15 seconds for the app to come back online before displaying the result, which can be frustrating if you're in a hurry.

View File

@ -79,3 +79,10 @@ def get_abs_url(url, page_url):
elif url.startswith('./'):
return f'{page_url}{url[2:]}'
return url
def list_to_dict(lst: list) -> dict:
if len(lst) < 2:
return {}
return {lst[i].replace(' ', ''): lst[i+1].replace(' ', '')
for i in range(0, len(lst), 2)}

View File

@ -1,5 +1,6 @@
from app.models.config import Config
from app.models.endpoint import Endpoint
from app.utils.misc import list_to_dict
from bs4 import BeautifulSoup, NavigableString
import copy
from flask import current_app
@ -43,6 +44,9 @@ SITE_ALTS = {
'quora.com': os.getenv('WHOOGLE_ALT_QUORA', 'farside.link/quetre')
}
# Include custom site redirects from WHOOGLE_REDIRECTS
SITE_ALTS.update(list_to_dict(re.split(',|:', os.getenv('WHOOGLE_REDIRECTS', ''))))
def contains_cjko(s: str) -> bool:
"""This function check whether or not a string contains Chinese, Japanese,